TARGETING AMYLOID


Antibody aducanumab reduces Alzheimer’s
disease-associated amyloid in human brain PAGES 36 & 50
Zika imbalance


The US government should not redirect vital funds to work on the Zika virus at the expense
of other health priorities.


Late in 2014, the US government finally put serious effort and
money into combating the Ebola outbreak in West Africa. The
US$5.4-billion emergency fund approved by Congress was the
largest amount of funding ever appropriated for a single international
health crisis. Numerous voices — including Nature’s — applauded
the investment while warning against using the money to crush
the outbreak and then fly back home. Experts warned that, without
permanent improvements to Africa's health-care systems, Ebola or
something worse would reappear. Health workers set out to use the
funds, which were intended to last until 2019.

But the short attention span of politicians and the public has
endangered these efforts. Resources intended to address the root causes of
epidemics are being transferred to research on Zika, the mysterious
virus that appeared in South America in 2015. Zika is not deadly and —
for most people — not particularly incapacitating. But it has captured
attention as the latest global health threat. Cash-strapped agencies such
as the US Centers for Disease Control and Prevention have run with that
theme, as have politicians eager to slam their adversaries in Congress.

In February, President Barack Obama called for Congress to
authorize $1.9 billion in emergency funds to respond to Zika, including
improved surveillance, international aid for health care and vector
control, and research towards a vaccine. Congress refused, so in April the
White House shifted nearly $600 million from the Ebola fund that was
still being used to improve disease-response training and screening.

The July announcement that mosquitoes in Florida are transmitting
Zika has redoubled calls for action. Two weeks ago, the administration
directed the National Institutes of Health (NIH) to move $34 million
from its research portfolio to Zika vaccine research; $47 million will
also be transferred from other medical-service budgets. Each NIH
institute — even those that do not focus on infectious disease — is
contributing about 0.14% of its budget to the Zika effort, according
to numbers supplied to Nature.

This crosses a line. Even when one sets aside global scourges such as 
malaria — which affects millions of people each year and rarely draws 
strident calls for emergency funds — Zika is just one more virus that 
affects the United States. Others include West Nile virus (which has 
no approved human vaccine), dengue and chikungunya, as well as 
the seasonal and circulating influenza viruses that can kill thousands. 

Taking money from much-needed research and health care to 
develop a vaccine against one disease itself costs lives. One analysis 
estimated that redirecting money to Ebola and away from common 
infections such as malaria and HIV caused nearly as many deaths as 
Ebola itself (A. S. Parpia et al. Emerg. Infect. Dis. http://doi.org/bcqt; 
2016). Triaging disease research and funding is always a complex issue, 
and, too often, public sentiment rather than health need drives policy. 

But it doesn’t need to be that way. When Congress returns to session 
on 6 September, the US administration should insist on a permanent 
fund from which public-health agencies can draw, similar to that made 
available for natural disasters. It should also dedicate more money to 
international surveillance, detection and health-care systems — the sort 
of work that the Ebola fund is intended to support — and implement 
more stringent vector-control strategies to keep many viruses in check. 

Each plea for emergency funds underscores just how unprepared 
the United States is for major health crises. The Obama administration
has rightly called for permanent emergency funds and money for
overall infrastructure improvement. But its willingness to sacrifice 
necessary research and development programmes to stick Band-Aids 
on the latest public-health scare erodes its credibility. When a truly 
deadly and pervasive pathogen appears in the United States, will there 
be any Band-Aids left?


Pachyderm plight 


Analysis highlights the threat to a newly
distinct species of African elephant.


Contrary to common wisdom, most researchers now accept
that African elephants are actually two distinct species. On the
savannah lives the huge Loxodonta africana, whereas the smaller,
secretive Loxodonta cyclotis is found in the forests of central Africa.
Poaching is devastating both populations, but poaching of forest
elephants should be of particular concern. Research by George
Wittemyer and his colleagues indicates that most females of this species
do not become pregnant for the first time until they are 23, and they
produce only 1 calf every 5 to 6 years (A. K. Turkalo et al. J. Appl. Ecol.
http://dx.doi.org/10.1111/1365-2664.12764; 2016). By contrast, the 
savannah elephant begins breeding at 12 years of age, and typically
produces young at 3- to 4-year intervals. Thus, forest-elephant populations
increase in size slowly, and are at greater risk of extinction. 

Wittemyer’s work should spur increased focus on poaching prevention,
and the study is also likely to reignite debate about the failure of the
International Union for Conservation of Nature (IUCN) to recognize 
two different African elephant species on its extinction-risk ‘red list’.
The IUCN has shied away from splitting the animals into two groups,
primarily over fears about what this would mean for the status of hybrids 
between savannah and forest animals (see go.nature.com/2bo5nx3). 

But the net effect of lumping the two together is to significantly
underestimate the vulnerability of the African forest elephant. At its
conservation congress this week, the IUCN needs to catch up with the science
and recognize the real threat of this species’ extinction.
Preserve personal freedom
in networked societies


Broad anti-discrimination laws and practices could compensate for failing
data protection and technology-linked loss of privacy, says Christoph Bock.


Surveillance is no longer the prerogative of government agencies.
It is privatized, decentralized — and often self-inflicted. Mobile
phones trace where we go and with whom we communicate.
Smartwatches measure heart rates and will soon start logging
happiness and anger. The resulting data are streamed over vulnerable
networks to commercial servers; they may be used by advertising
companies or shared on social networks.

Current data-protection laws are not prepared for this new reality.
Conceptualized in the 1970s and 80s, they were designed for a society
that perceived official government databases as the main privacy risk.
Their focus on centralization, parsimony and secrecy clashes with
today’s reality of ubiquitous personal data, deliberate sharing in social
networks and all-too-frequent data leaks.

We are quick to blame naive users and
careless software developers when personal data
are compromised, but the truth is that prudent
individual behaviour provides little protection
from networked surveillance. Even if I stop
using my mobile phone to navigate the digital
and physical world, I will still appear in the
records of the people around me.

Emerging technologies aggravate the
situation. Camera drones watch us from above.
Augmented-reality games such as Pokémon Go
allow developers (or their sponsors) to control
where we go in the real world. And handheld
DNA sequencers will not only enable real-time
monitoring of airborne pathogens (and
exciting citizen-science projects), but also
reveal our genetic data to anybody who can obtain our DNA.

Large data sets as substrates for computer algorithms and
machine-learning technology assist our daily lives — suggesting where to eat,
which book to read and how to stay healthy. But they can be used against
us, for example by predicting credit risk or the likelihood of committing
a crime. Such predictions can be remarkably accurate, but they struggle
with unusual behaviour and often discriminate against minorities.
This emergent discrimination is difficult to avoid because it is rarely
hard-coded into the algorithms but arises from biased training data.
People might start to ‘act mainstream’ just to be on the safe side —
certainly not desirable for a pluralistic society.

So how can we mitigate the inherent risks that ‘big data’ pose for
personal freedom, as billions of connected devices churn out personal
data, and data protection by secrecy has become an illusion?

We must remember that data protection is a means to an end, rather
than a goal in itself. We do not protect data because the data would take
harm; rather, we seek to protect the rights and well-being of individuals
who might be harmed by certain uses of their data. This observation
could hold the key to protecting personal freedom in a world of
evaporating privacy. Finding ways to tame harmful uses of personal
data would make future data leaks and unguarded data sharing less of
a threat. We can distinguish between essentially financial risks, defined 
by damages that could be fully compensated through (potentially large) 
financial payments, and social risks, which affect interpersonal
relationships in a way that cannot be reduced to monetary transactions.

Financial risks include higher health-insurance premiums due to 
genetic risk factors, or waiting longer in a service hotline because the 
address or a prediction algorithm indicates a low-value customer. 
Strong anti-discrimination and consumer-protection laws can 
mitigate these risks, especially when combined with protection for 
whistle-blowers who uncover violations, and hardship funds that 
provide compensation when a perpetrator cannot pay. 

Social risks include shaming by friends and 
family over compromising video footage, or attacks 
over a personal opinion that has become public. 
Social risks are hard to tackle by legislation, as
individuals are unlikely to sue family members for fair
and equal treatment. Nevertheless, anti-discrimination
laws help mitigate social risks by sending an
authoritative message that certain types of
discrimination are inappropriate, creating a spillover
effect into aspects of our everyday lives not 
normally controlled by laws and litigation. 

Strong anti-discrimination laws thus emerge 
as a cornerstone of personal freedom when 
data protection fails and secrecy is compromised
by ubiquitous data sharing. The European
Union's Charter of Fundamental Rights
shows that such protection is legally and politically
achievable, prohibiting discrimination by “sex, race, colour,
ethnic or social origin, genetic features, language, religion or belief, 
political or any other opinion, membership of a national minority, 
property, birth, disability, age or sexual orientation”. The Canadian 
Human Rights Act also provides relatively broad protection. But 
the situation is much more fragmented in the United States, and 
insufficient in China, Japan and large parts of the developing world. 

Scientists can contribute to ensuring that the loss of privacy through 
technology does not result in loss of personal freedom. First, they 
can credibly assess current and future privacy risks of new technologies
and stress the need to move beyond the unsustainable concept of
data protection by secrecy. Second, they should advocate for robust 
legal protection against discrimination around the world. Third, they 
should educate, advise and monitor, to make sure that facts — not 
fears — dominate the political debate.


Christoph Bock is a principal investigator at the CeMM Research
Center for Molecular Medicine of the Austrian Academy of Sciences in 
Vienna. 

e-mail: cbock@cemm.oeaw.ac.at 


1 SEPTEMBER 2016 | VOL 537 | NATURE | 9 


© 2016 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 


Selections from the 
scientific literature 


RESEARCH HIGHLIGHTS 


Diet restriction 
makes fat brown 


Very low-calorie diets — 
shown to boost longevity in 
some mammals — can turn 
white, energy-storing fat into 
beige, energy-burning fat in 
mice.

Calorie restriction and the 
accumulation of beige and 
brown fat have both been 
associated with metabolic 
benefits such as increased 
sensitivity to insulin. To 
look for a link between the 
two, Mirko Trajkovski of 
the University of Geneva in 
Switzerland and his colleagues 
cut the calories given to 
normal-weight and obese 
mice by 40% and found that 
this triggered the browning of 
white fat into beige fat in both 
types of animal. 

The restricted diet also 
raised the levels of certain 
immune-system proteins 
called cytokines. Mice that 
were genetically engineered 
to lack responses to these 
cytokines did not turn fat 
beige in response to caloric 
restriction — and also did 
not experience many of the 
metabolic benefits. 

Cell Metab. http://dx.doi.org/10.1016/j.cmet.2016.07.023
(2016)


NEUROSCIENCE

Memory trick
dampens phobia


Recalling fearful memories
shortly before receiving
psychological therapy could
help people to diminish
long-held fears.

Once retrieved, a memory
can be disrupted before it is
reconsolidated — returned
to long-term storage in the
brain. In a study of people
with a lifelong fear of spiders,
Johannes Björkstrand and
his colleagues at Uppsala
University in Sweden presented
volunteers with pictures of
spiders to activate their fear
memory. They then performed
exposure therapy (repeatedly
showing pictures of spiders)
either 10 minutes later, during
memory reconsolidation,
or six hours later, after
reconsolidation had finished.
In a comparison of the two
groups the following day, those
who were treated at 10 minutes
showed reduced activation in
the amygdala — a brain area
that mediates fear — while
viewing pictures of spiders.
They were also more likely to
choose to view a picture of a
spider in exchange for money.
Curr. Biol. http://dx.doi.org/10.1016/j.cub.2016.08.022
(2016)


HYDROLOGY


South Asia water supplies at risk


Groundwater supplies in northern India,
Pakistan, Nepal and Bangladesh could be more
endangered by contamination than by depletion.
The Indo-Gangetic Basin includes the Indus,
Ganges and Brahmaputra river systems and is
one of the world’s most heavily used freshwater
reservoirs. Previous low-resolution satellite
data suggested that current exploitation rates
are unsustainable. To study the region in greater
detail, Alan MacDonald at the British Geological
Survey in Edinburgh and his colleagues
examined records from nearly 3,500 water
wells and other high-resolution data to estimate
groundwater levels and quality within the top
200 metres of the aquifer. The team found that
60% of the system was plagued with high levels
of salt, arsenic and other pollutants. But across
70% of the aquifer, the water table has been
stable, or has even risen, from 2000 to 2012.
Groundwater quality should be monitored
to provide data for policymakers, the authors
say.
Nature Geosci. http://dx.doi.org/10.1038/ngeo2791
(2016)


Exotic pentaquark
confirmed


After multiple false
detections, physicists have
now confirmed in a pair
of studies the existence of
subatomic particles known as
‘pentaquarks’.

In the standard model,
particles called baryons, which
make up most of the visible
matter in the Universe and
include protons and neutrons,
are built from three fractionally
charged objects called quarks.
Theorists have predicted
that quarks could aggregate
into larger groups and have
speculated for years about
the short-lived pentaquark,
composed of four quarks and
an antiquark. Now researchers
at the LHCb experiment at
CERN’s Large Hadron Collider
near Geneva, Switzerland,
have come up with the most
convincing evidence yet for
this exotic particle.

In one study, the authors
reanalysed previous
particle-decay data while reducing
their model's assumptions.
They showed at extremely
high statistical significance
that pentaquarks are needed to
explain the data. In the second
study, the researchers examined
data from a particular kind of
decay, finding that they are in
line with predictions of decays
involving pentaquarks.

Phys. Rev. Lett. http://doi.org/bpsc;
http://doi.org/bpsb (2016)


Cuttlefish can 
count 


Cuttlefish seem to be able to 
distinguish between large and 
small numbers, at least when it 
comes to food. 

Tsang-I Yang and Chuan- 
Chin Chiao at National Tsing 
Hua University in Hsinchu, 
Taiwan, let pharaoh cuttlefish 
(Sepia pharaonis; pictured) 
in the lab choose between two 
chambers containing different 
numbers of shrimps to eat. The 
animals consistently selected 
the chamber with more 
shrimps, regardless of whether 
there was a large or small 
difference in prey numbers. 
The cuttlefish also opted for 
two shrimps that were smaller 
and easier to eat than one 
large shrimp. But if they were 
hungry, they took the bigger 
and trickier meal. 

This shows that cuttlefish 
have a number sense, and 


that their choice of prey is 
motivated by both hunger 
and the size of the potential 
reward, the authors say. 
Proc. R. Soc. B 283, 20161379 
(2016) 


Rare mineral 
found on Earth 


Volcanic rocks from Israel 
contain the first known 
occurrence on Earth of a
titanium-rich mineral called 
tistarite. The discovery 
suggests that deep-Earth 
chemistry may differ from 
what scientists had suspected. 

Until now, tistarite had been 
found only in a single meteorite 
from Mexico. A team led by 
William Griffin at Macquarie 
University in Sydney, Australia, 
found more of it in rocks from 
Mount Carmel. 

Tistarite forms in chemically 
reducing conditions, for 
instance in high-hydrogen 
environments. The authors 
suggest that hydrogen or 
methane might percolate 
deep into volcanic plumbing 
systems, creating ultra- 
reducing pockets in which the 
unusual mineral can form. 
Geology http://dx.doi.org/10.1130/G37910.1 (2016)


INFECTION 


Effects of sexually 
spread Zika 


Vaginal infection of pregnant 
mice by the Zika virus can 
cause growth restriction, brain 
infection and death of the fetus. 
Some people have been 
infected by the Zika virus 
through sexual activity rather
than from mosquito bites. To 
study the effects of the virus 
after sexual transmission, a 
team led by Akiko Iwasaki 
at the Yale University School 
of Medicine in New Haven, 
Connecticut, developed a 
mouse model for vaginal 
transmission of the virus. 
They found that this mode of 
infection caused pathology in 
the fetuses in immunologically 
normal mothers; previous 
studies had suggested that Zika 
could not sustain long-lived 
infections in such animals 
when injected into the skin. 
The results implicate the 
female genital tract as a 
particularly vulnerable site for 
Zika infection. 
Cell http://dx.doi.org/10.1016/j.cell.2016.08.004 (2016)


Uneven growth of 
human footprint 


The human footprint on the 
global environment increased 
by just 9% from 1993 to 2009, 
even though the world’s 
population grew by 23% and 
the economy by 153% during 
that period. However, this 
varied by region. 

A previous study had looked 
at humanity’s impacts on the 
terrestrial globe, using satellite 
and survey data from 1993 to 
quantify built environments, 
agricultural land, population 
density and other variables. 

To update the work, Oscar 
Venter at the University of 
Northern British Columbia in 
Prince George, Canada, and 
his colleagues compared those 
numbers with 2009 data. 

They found that areas 
with the highest levels of 


biodiversity, including many 
tropical areas, showed the 
fastest growth of the human 
footprint (pictured, in red and 
orange). Wealthy nations and 
those with strong control of 
corruption and high rates of 
urbanization showed the least 
growth in impacts (green). 


Nature Commun. 7, 12558 (2016);
Sci. Data 3, 160067 (2016) 


NEUROSCIENCE
Protein controls 
brain’s thermostat 


A heat-sensitive protein in 
the brain helps to detect and 
regulate body temperature in 
mice. 

Previous research had 
suggested that the ion channel 
TRPM2, which allows ions to 
pass across cell membranes, 
is involved in sensing warm 
temperatures. Now Jan Siemens 
at the University of Heidelberg 
in Germany and his colleagues 
report that the channel is 
expressed in a part of the 
hypothalamus, a brain region 
that helps to control body 
temperature. When injected 
with a molecule that triggers 
fever, mice lacking TRPM2 had 
higher body temperatures than 
control animals. The team also 
found that activating
TRPM2-expressing neurons decreased
body temperature, whereas
inhibiting those neurons 
increased it. 

The authors suggest that 
TRPM2 helps the brain to 
limit the severity of a fever. 
Science http://doi.org/bpzg 
(2016) 
POLICY


Union rights

A US national labour board
has ruled that graduate
students in the United
States who work as teaching
and research assistants at
private universities must be
recognized as employees,
and therefore have a right to
unionize. Graduate-student
unions are already common
at public institutions. The
23 August ruling relates to
a case involving a group
of students at Columbia
University in New York City
who have struggled to get their
union recognized. There has
been debate in recent years
over the rights of graduate
students, many of whom teach
courses while completing their
degrees.


Zika blood scans

The US Food and Drug
Administration (FDA)
advised US blood banks on
26 August to test all blood
donations for Zika virus, in
light of the virus’s spread in
the United States (see page 7).
Thousands of US travellers
have been infected with
Zika virus, but since July,
29 people in south Florida
have contracted it locally
through mosquitoes, and the
virus is expected to spread
to other states. Previously,
the FDA recommended
Zika blood screening only in
states affected by the virus.
Separately, Singapore has
reported its first small cluster
of locally transmitted cases. It
joins Vietnam, Thailand and
the Philippines as countries in
southeast Asia that have also
reported their first sporadic
transmissions of the virus this
year.


NUMBER CRUNCH

The latest estimate of the
clean-up cost of a 2014
accident at a New Mexico
underground nuclear-waste
repository. The sum would
make the nuclear accident,
in which a drum containing
radioactive waste blew up,
the costliest in US history.


Source: Los Angeles Times


Obama creates largest marine park


US President Barack Obama announced
the creation of the world’s largest marine
protected area on 26 August, with a huge
expansion of the Papahanaumokuakea park
in the northwest of the Hawaiian Islands.
The move will take the park from its current
size of around 360,000 square kilometres to
1.5 million square kilometres. The area is
home to wildlife including whales, corals,
millions of seabirds and the endangered
Hawaiian monk seal (Neomonachus
schauinslandi, pictured).


Child-health chief

Medical geneticist Diana
Bianchi will be the new
head of the US National
Institute of Child Health and
Development (NICHD),
the National Institutes of
Health (NIH) announced on
25 August. She replaces Alan
Guttmacher, who retired in
September 2015. As director,
Bianchi will oversee the
NICHD’s US$1.3-billion
annual budget, which
includes the Human Placenta
Project and participation
in a new NIH longitudinal
study called Environmental
Influences on Child Health
Outcomes. Bianchi, who
studies prenatal diagnostics,
will take the helm on
31 October.


Iranian physicist

Omid Kokabee, a physicist
who has been imprisoned
in Iran for five years on an
espionage conviction, has been
granted freedom on parole,
his lawyer said on 29 August.
Kokabee, 34, was working on
his PhD when he was jailed
in Tehran in 2011. In April
this year, he was moved to
hospital to have kidney-cancer
surgery. He was then granted
temporary medical leave
and released after his friends
posted bail. Kokabee has
maintained his innocence and
said that he was persecuted for
refusing to work on a military
nuclear programme in Iran.
See go.nature.com/2cb5ab0
for more.


Physicist dies

US particle physicist James
Cronin died on 25 August,
aged 84. In 1964, with
colleague Val Fitch and
their collaborators, Cronin
discovered anomalies in the
decay of kaon particles in
an accelerator experiment
at the Brookhaven National
Laboratory in New York. The
anomalies revealed a subtle
asymmetry between matter
and antimatter known as CP
violation. Cronin and Fitch
received a Nobel prize for
their discovery in 1980. In
the 1990s, Cronin became
a driving force behind the
Pierre Auger Observatory
in Malargüe, Argentina, the
largest cosmic-ray facility in
the world, completed in 2004.
Cronin was on the faculty of
the University of Chicago in
Illinois.


Italy earthquake

A 6.2-magnitude earthquake
struck central Italy in the
early hours of 24 August,
killing some 290 people and
devastating towns in the
Apennine mountains. The
quake struck 40 kilometres
from L’Aquila, where a
similar event killed around
300 people in 2009. The region
is tectonically complex, and
seismologists had expected
a rupture to occur there
at any time. More than
900 aftershocks occurred,
impeding recovery efforts. See
page 15 for more.


Airlander nosedive

The world’s largest aircraft,
which had a successful
maiden flight in mid-August,
has crash-landed on its second
attempt. The 92-metre-long
Airlander 10, which combines
aeroplane and airship
technology, nosedived on
landing after the 100-minute
test flight in Bedfordshire,
UK, on 24 August (pictured).
The cockpit of the craft was
damaged, but nobody was
injured, said the Airlander’s
developer Hybrid Air Vehicles
of Bedford. Airlander 10
is intended for use in
surveillance, communication,
aid delivery and even
passenger travel.


‘No Planet B’

More than 150 Australian
scientists sent an open letter
on 24 August to the country’s
prime minister, Malcolm
Turnbull, urging action on
global warming. The 2015
Paris climate agreement
remains unbinding, and the
world’s governments are
“presiding over a large-scale
demise of the planetary
ecosystems”, the scientists
wrote. Citing Turnbull’s 2010
statement that humanity has
an obligation to the planet,
the scientists called on the
Australian government to do
what is required to reduce
carbon emissions and coal
exports. “There is no Planet B,”
the scientists wrote.


RESEARCH


Leprosy vaccine

India is to begin testing the
world’s first vaccine that
exclusively targets leprosy. The
disease, which is caused by
the bacterium Mycobacterium
leprae, newly affects
125,000 people in India each
year — 60% of global new
cases. The vaccine, developed
in India, has been approved by
the country’s drug-regulation
agency as well as the US Food
and Drug Administration.
According to media reports,
tests will begin in a few weeks
in five districts in Bihar and
Gujarat, treating people who
live in close contact with
infected individuals. Trials have
shown that infections could be
reduced by 60% in 3 years.


TREND WATCH


By 2085, most cities will be too
hot to host the summer Olympics,
according to an analysis in The
Lancet (K. R. Smith et al. Lancet
388, 642-644; 2016). Using
climate modelling and a measure
of heat stress, researchers judged
the suitability of cities on the basis
of whether conditions would be
safe to run a marathon. Looking
at the Northern Hemisphere, they
found 25 cities in western Europe
— and just 8 elsewhere — where
temperatures were likely to be less
than 26°C in the shade, defined as
low risk for marathon running.


CLIMATE CHANGE VERSUS THE SUMMER OLYMPICS


Most cities might be too hot to host a summer Games after
2085; western European cities may be the most suitable.
SEVEN DAYS | THIS WEEK


4-7 SEPTEMBER 
Researchers gather at the 
10th Vaccine Congress 
in Amsterdam. 
www.vaccinecongress.com 


6-9 SEPTEMBER 
Enthusiasts head to the 
British Science Festival 
for activities and talks. 
britishsciencefestival.org 


China set for Mars 
The China National Space 
Administration is moving 
ahead with plans to send a
rover to Mars in 2020. On
23 August, officials unveiled
details of the lander, which 
will explore a low-latitude 
area in Mars’s northern 
hemisphere. The six-wheeled 
probe, to be named by a 
public contest, is designed to 
operate for at least 6 months; 
its 13 payloads will include
a ground-penetrating radar
to study rock layers. Other 
agencies aiming to send rovers 
to Mars during the 2020 
launch opportunity include 
NASA and the European 
Space Agency. 


Robo-taxi trial 
Technology company 
nuTonomy said on 25 August 
that it will start trials of self- 
driving taxis in Singapore, 
in which customers will be 
able to request a ride using a 
smartphone app. Engineers 
from the company, which
is based in Cambridge,
Massachusetts, and 
Singapore, will ride in the 
car, ready to take the wheel 
as needed. The joint project 
with the Singapore Land 
Transport Authority aims to 
launch a fully autonomous 
taxi service by 2018. US ride- 
hailing company Uber and 
carmaker Volvo have said 
that they are starting
similar trials in Pittsburgh,
Pennsylvania. 
SEISMOLOGY


The town of Amatrice in central Italy has been devastated by the earthquake on 24 August. 




Italian scientists shocked by 
earthquake devastation 


In a region known to be seismically active, destruction on this scale was still a surprise.


BY ALISON ABBOTT AND 
QUIRIN SCHIERMEIER 


A devastating 6.2-magnitude earthquake
in central Italy on 24 August that killed
more than 290 people was the country’s
largest since a magnitude-6.3 earthquake
in 2009 that hit the town of L’Aquila, about
40 kilometres away. That event killed 308 people
and destroyed tens of thousands of homes and
a university. Controversially, it also caused six
scientists to be put on trial for manslaughter.
Central Italy’s complex geological and tectonic
make-up creates a notorious quake risk.
The Adria micro-plate dives beneath the Apennine
mountain range from east to west, creating
seismic strain. The mighty Eurasian and African
plates also collide here, with the Eurasian plate
moving northeast at 24 millimetres per year.
The latest quake also injured hundreds and
laid waste to historic villages in the Apennine
mountains, including Amatrice (see
‘Epicentre of a quake’). It was a result of increased
horizontal stress perpendicular to the mountain
chain.

Seismologists had expected a rupture to
occur near the location at any time. Still,
Giulio Selvaggi, a research director at the
National Institute of Geophysics and Volcanology
in Rome, and one of those initially convicted
of manslaughter — all six were cleared
on appeal — says he was shocked by the death
and destruction wreaked by last week’s
quake. The mountainous region around
Amatrice is sparsely populated, but the final
death toll may exceed that of more populated
and urbanized L’Aquila.

Selvaggi seconds a public outcry over the 
failure of authorities to prioritize making 
old buildings more earthquake-resistant and 
notes that his team supplies earthquake maps 
to them. “We scientists have made a beautiful,
detailed seismic hazard map, showing clearly
the areas in greatest need of preventive measures,”
he says. “But public authorities don’t take
enough action.”

The court case over the L’Aquila earthquake
came about because a local amateur researcher
claimed to have evidence of an imminent, large
quake. Six scientists and one government official
who had publicly dismissed the amateur’s
methods were accused of misinforming the
public. Following an unprecedented trial, all
seven were given six-year jail sentences for
manslaughter, but the scientists were cleared
on appeal in 2014.


EPICENTRE OF A QUAKE
The earthquake is the strongest in Italy since the
magnitude-6.3 event in 2009 near L’Aquila.


Computer scientist Paola Inverardi, who is 
rector of the university in L’Aquila, says the
rebuilding of the university is nearly com- 
plete, and that research activities had resumed 
by 2012. Science in the region has also benefited 
from supporting initiatives following the quake, 
she says. One of these is the Gran Sasso Science 
Institute, an international graduate school 
founded in 2012 to inject young intellectual life 
into L’Aquila. It has been so successful that in
June it was awarded university status. 

Unlike the earthquake in L’Aquila, which
was preceded by frequent, mostly low-magnitude,
tremors in the surrounding area, no
seismic activity was recorded before the latest
earthquake. “It came out of the blue, without
the preceding tremors we experienced in ‘our’
earthquake,” says Inverardi. L’Aquila itself
experienced virtually no damage, but, she says,
“psychologically we were all pushed back”.


Nuclear power plants
prepare for old age


Efforts are afoot to keep the world’s reactors running well past 2050. 


BY JEFF TOLLEFSON 


Sophisticated inspections are helping to
pick up defects in ageing nuclear power
plants before they cause trouble. In
March, ultrasonic tests identified signs of wear
and tear in some of the stainless-steel bolts in
the reactor core of the Indian Point power plant
just north of New York City. Researchers at the
Electric Power Research Institute (EPRI) in
Palo Alto, California, are now analysing more
than a dozen of the 5-centimetre-long bolts
— which secure plates that help direct water
through the radioactive core — to determine
why they failed the inspection.

The analysis comes as the US Nuclear
Regulatory Commission (NRC) considers
whether to extend the life of Indian Point’s two
40-year-old reactors for 20 more years. Opponents
of the plant, including the state of New
York, cite the defective bolts, a transformer fire
last year and environmental and safety concerns
as evidence that the facility should close.

The plant’s damaged bolts are just one
example of the maintenance issues facing ageing
nuclear reactors around the world. The
International Atomic Energy Agency and the
NRC are developing management guidelines
for these facilities, but the problem may be
most acute for the United States, whose fleet of
99 reactors is the oldest and largest.


GOING, GOING, GONE
Nuclear power accounts for 20% of US electricity generation, but few new reactors are being built.
The following shows the total projected energy output of these plants under different licence-renewal plans.

The NRC has renewed the licences of 81 US
reactors still in operation for another 20 years.
And it presented safety guidelines in December
for utilities considering renewing their licences
for a further 20 years. But concerns remain about
the effects of time on facilities that could be in
operation for 80 years (see ‘Going, going, gone’).

Former NRC chair Allison Macfarlane says 
that the industry has been struggling economi- 
cally in the face of cheap natural gas, and that 
many nuclear power companies are investing 
the bare minimum when it comes to main- 
tenance and upgrades. She would rather see 
a transition to newer — and safer — reactor 
designs than attempts to push old ones to their 
limits. 


EXTENDING LIFETIMES 

Kurt Edsinger, director of materials at the 
EPRI, and his team will run a battery of tests on 
some of the Indian Point bolts to examine frac- 
tures and assess the strength of the material. 
They will also analyse the effects of roughly 
four decades of neutron bombardment on the 
crystalline structure of the steel in the bolts. 


SOURCE: NUCLEAR ENERGY INSTITUTE 
Many US nuclear-reactor facilities are old. The Browns Ferry plant in Athens, Alabama, opened in 1974.


The study is part of a larger effort by the 
EPRI and the US Department of Energy to 
inform the industry and regulators around the 
world about the risks regarding ageing materi- 
als and components as nuclear power plants 
come up for further licence renewals. 

“So far, there have been no generic showstoppers
identified that would preclude a second
licence renewal,” says Kathryn McCarthy, technical
director of the energy department’s Light
Water Reactor Sustainability Program.

With few new reactors coming on line
around the world, the longevity of existing
facilities could have huge implications for
the global climate. Nuclear plants currently
provide 20% of the United States’ electricity
— and more than half of its low-carbon power.
At the global level, only hydropower provides
more low-carbon power, at roughly 16% of
total electricity produced, compared with
nearly 11% for nuclear.

“If you maintain them and replace parts,
there is no reason why nuclear plants can’t run
a very long time, which is great news from a
climate perspective,” says Michael Shellenberger,
president of the Environmental Progress
advocacy group in Berkeley, California.

Others are less sanguine. Important ques- 
tions remain regarding the durability of parts 
that inspectors cannot see, such as under- 
ground power cables, as well as about how 
materials age, says Macfarlane. 

Of particular concern are the concrete
containment structures and steel pressure
vessels at the heart of reactors, as well as the
kilometres of wires that snake through the
plants. Researchers are now analysing the
long-term effects of intense heat and neutron
bombardment on a plant’s crucial materials
down to the atomic level.

In some cases, scientists conduct acceler- 
ated-ageing experiments, in which materials 
are intensely irradiated to simulate 80 years 
of activity inside a reactor. That information 
can then be plugged into models that project 
degradation. 


EARLY WARNING 

The NRC’s licence-renewal process focuses on 
crucial infrastructure that might not be part 
of regular maintenance programmes. The goal 
is to create an inspection system that detects 
defects before they become a problem, says 
Allen Hiser, a senior technical adviser in the 
NRC division that handles licence renewals. 

NRC officials say this is what happened at 
Indian Point; similar bolt defects were discov- 
ered in 1988 at a nuclear reactor in France, and 
the agency established inspection require- 
ments to detect such issues in the future. 

But that is not the whole story, says Dave 
Lochbaum, head of the nuclear-safety project 
at the Union of Concerned Scientists advo- 
cacy group in Cambridge, Massachusetts. The 
ultrasonic inspection that identified the dam- 
aged bolts at Indian Point — a technique that 
is now mandatory — came about only after the 
state of New York challenged the adequacy of 
visual inspections nearly a decade ago, he says. 

Macfarlane remains sceptical. If the licences 
for current US plants are renewed for a second 
time, the facilities will live to be 80 years old, 
with nearly 100-year-old designs, she says. 
“We would be much better off with some of
the newer reactors.”
Print your own 3D hominin
to work out how Lucy died 


Digital scans will help to test whether the famous australopithecine fell out of a tree. 


BY EWEN CALLAWAY 


The world’s most famous fossil is now
open source. 3D scans of Lucy — a
3.18-million-year-old hominin found
in Ethiopia — were released on 29 August,
allowing anyone to examine her arm, shoulder
and knee bones and even make their own
3D-printed copies.

The scans accompany a Nature paper that 
argues that Lucy, a human relative belonging 
to the species Australopithecus afarensis, died 
after falling from a tree (J. Kappelman et al. 
Nature http://dx.doi.org/10.1038/nature19332; 
2016). The team behind the paper also made 
the scans available to the public and is eager 
for other researchers to test the hypothesis by 
printing out the bones. 

“It’s one thing for me to describe it in detail
in paper, but it’s another thing to hold these
things, to be able to print them out, look at
them and put them together,” says team leader
John Kappelman, a palaeoanthropologist at the
University of Texas at Austin.

His team received approval from the National
Museum of Ethiopia and the country’s government
to make the models of Lucy public. “My
sense from the Ethiopians is that Lucy is not
only their national treasure, but they see her as a
treasure for humankind,” says Kappelman, who
hopes that the country will soon release digital
scans of the rest of Lucy and that other countries
may follow suit with other hominin fossils.

“Coming from Ethiopia, it really is a posi- 
tive step, because other countries that are 
hesitant may be willing to do the same thing,” 
says Louise Leakey, a palaeontologist at Stony 
Brook University in New York. 

But Kappelman and others say that such a
move could threaten cash-strapped museums
— many of them in Africa — that rely on
income generated from casts of their fossil
collections to help them survive.


Lucy’s arm bone undergoes a computed-tomography scan.

Lucy’s digital debut was eight years in the 
making. Her 40%-complete remains spent 10 
days in Kappelman’s lab in August 2008 during a
US tour. His team worked day and night to scan 
every one of several hundred bone fragments 
using a computed-tomography (CT) imager. 

Close examination revealed unusual 
fractures: the end of her right humerus that 
connected to her shoulder had a series of clean 
breaks and compressions similar to those that 
orthopaedic surgeons often see in people who 
attempt to break a fall with an outstretched 
arm. Damage to Lucy’s pelvis, left shoulder and 
knee and right ankle was also consistent with 
a fall from a great height. Kappelman’s team 
estimates that Lucy fell from a tree taller than
10 metres, reaching a speed of up to 60 kilometres
per hour at impact, and died from her injuries.
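
The height and speed quoted above can be related with simple free-fall physics. The sketch below is only an illustrative check, not the method used in the paper: it assumes a fall from rest with no air resistance and standard gravity, and the function names are made up for this example.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (standard approximation)


def impact_speed_kmh(height_m: float) -> float:
    """Speed in km/h after falling height_m from rest, ignoring air resistance."""
    return math.sqrt(2 * G * height_m) * 3.6


def height_for_speed_m(speed_kmh: float) -> float:
    """Drop height in metres needed to reach speed_kmh, ignoring air resistance."""
    speed_ms = speed_kmh / 3.6
    return speed_ms ** 2 / (2 * G)


if __name__ == "__main__":
    print(f"{impact_speed_kmh(10):.1f} km/h after a 10 m fall")
    print(f"{height_for_speed_m(60):.1f} m drop needed to reach 60 km/h")
```

Under these assumptions a 10-metre drop gives roughly 50 kilometres per hour, and 60 kilometres per hour corresponds to a drop of about 14 metres, which fits the pairing of "taller than 10 metres" with "up to 60 kilometres per hour".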


ARBOREAL ORIGINS 

It’s unclear how suited Lucy was to arboreal 
life. She walked upright, but she may have held 
onto adaptations that helped her ancestors cope 
in trees — although that idea is hotly debated. 
Kappelman’s team proposes that Lucy would 
have slept in trees to avoid predators, yet was 
not as adroit there as her more-ape-like ancestors.
“Here’s the most famous fossil on the
planet, the centre of the debate over arborealism
in human evolution, and we think it’s most
likely she died from a fall out of tree,” he says.
But Marc Meyer, a palaeoanthropologist
at Chaffey College in Rancho Cucamonga, 
California, who recently examined Lucy 
in Addis Ababa, is sceptical. Chimpanzees 
tend to break their spines when they fall 
from trees, says Meyer, and “Lucy’s spine 
does not come close to the amount of dam- 
age we would expect to see in a fatal fall”. 

Lucy’s discoverers noticed her broken 
bones when they found her, but proposed 
that this had occurred after she died. Don- 
ald Johanson, the palaeoanthropologist at 
Arizona State University in Tempe who 
found Lucy in 1974, still stands by that inter- 
pretation. Broken bones such as Lucy’s are 
common in other nearby remains, he notes. 

Kappelman is keen for others to test his team’s
theory. Digital models of portions of Lucy’s
left knee and right shoulder and arm are
available at eLucy.org.

But although printed bones and virtual 
models can be helpful, Meyer says there 
is no substitute for seeing a fossil in per- 
son. He found stark differences between 
Ardipithecus ramidus, a 4.4-million-year- 
old hominin also found in Ethiopia, and 
a physical cast that he studied, including 
several deformities not captured in the cast. 


DIGITAL DOWNLOADS 
Digital models of hominin fossils are rare, 
but a few are available. About 100 of the 
1,500 remains ascribed to Homo naledi, 
uncovered in 2013 in a South African cave
system, can be downloaded at Morpho- 
Source.org, as can models of the 2-million- 
year-old Australopithecus sediba found by 
the same team in 2008. 
AfricanFossils.org, which distributes
digital models of hominin fossils for education
and is headed by Leakey, contains
numerous important specimens from Kenya.
But the website’s models, although
sufficient for 3D printing in many cases, are
purposefully low in resolution, so as not to
cut into income generated from making
physical replicas.

Kappelman would like to see such 
revenue streams maintained, for instance 
by making lower-quality models free while 
charging researchers for good digital reproductions.
“What has to be done is to put
together a good business model that allows
these museums to be able to have some sort
of revenue stream off of these data,” he says.

Leakey, however, thinks that charging 
researchers will further limit access. She 
also points out that digital models can easily 
be pirated. “The days of keeping this content
squirrelled away are gone,” she says.
“Once you make a 3D model available, to
control it is impossible.”
The ‘family trees’
of mathematics


Academic relationships hint at science, and world, history. 


BY DAVIDE CASTELVECCHI 


Most of the world’s mathematicians fall
into just 24 scientific ‘families’, one
of which dates back to the fifteenth
century. The insight comes from an analysis of
the Mathematics Genealogy Project (MGP),
which aims to connect all mathematicians, living
and dead, into family trees on the basis of
teacher–pupil lineages, in particular who an
individual’s doctoral adviser was¹.

The analysis also uses the MGP — the most 
complete such project — to trace trends in the 
history of science, including the emergence of 
the United States as a scientific power in the 
1920s and the rise to dominance of different 
mathematical subfields. 

“You can see how mathematics has evolved 
in time,” says Floriana Gargiulo, who studies 
networks dynamics at the University of Namur, 
Belgium, and who led the analysis. 

The MGP is hosted by North Dakota State 
University in Fargo and co-sponsored by 
the American Mathematical Society. Since the 
early 1990s, its organizers have mined infor- 
mation from university departments and from 
individuals who make submissions regarding 
themselves or people they know about. As 
of 25 August, the MGP contained 201,618 
entries. As well as doctoral advisers and 
pupils of mathematicians, the MGP contains 
details such as the university that awarded the 
doctorate. 

Previously, researchers had used the MGP to 
reconstruct their own PhD-family trees, or 
to see how many ‘descendants’ a researcher has. 
Gargiulo’s team wanted to make a comprehen- 
sive analysis of the entire database and divide it 
into distinct families, rather than just looking 
at how many descendants any one person has. 

After downloading the database, Gargiulo 
and her colleagues wrote machine-learning 
algorithms that cross-checked and comple- 
mented the MGP data with information from 
Wikipedia and from scientists’ profiles in the 
Scopus bibliographic database. 

This revealed 84 distinct family trees with
two-thirds of the world’s mathematicians concentrated
in just 24 of them. The high degree
of clustering arises in part because the algorithms
assigned each mathematician just one
academic parent: when an individual had more
than one adviser, they were assigned the one
with the bigger network. But the phenomenon
chimes with anecdotal reports from those who
research their own mathematical ancestry, says
MGP director Mitchel Keller, a mathematician
at Washington and Lee University in Lexington,
Virginia. “Most of them run into Euler, or
Gauss or some other big name,” he says.
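
The single-parent rule described above can be sketched in a few lines. This is a toy illustration, not the team's machine-learning pipeline: the input format, the function name `build_families` and the use of direct-student counts as a stand-in for "the bigger network" are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def build_families(advisers: dict) -> dict:
    """Group people into family trees keyed by founder.

    advisers maps each person to a list of their doctoral advisers
    (an empty list if none is recorded).
    """
    # Assumed proxy for "bigger network": number of direct students.
    student_count = Counter(a for advs in advisers.values() for a in advs)

    # Keep exactly one academic parent per person.
    parent = {
        person: max(advs, key=lambda a: student_count[a])
        for person, advs in advisers.items()
        if advs
    }

    def founder(person):
        # Walk up the single-parent links until no adviser is recorded.
        seen = set()
        while person in parent and person not in seen:
            seen.add(person)
            person = parent[person]
        return person

    families = defaultdict(set)
    for person in set(advisers) | set(student_count):
        families[founder(person)].add(person)
    return dict(families)


if __name__ == "__main__":
    toy = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "X"], "E": ["X"], "X": []}
    for root, members in sorted(build_families(toy).items()):
        print(root, sorted(members))
```

In the toy data, D has two advisers; X has more students than B, so D joins X's tree. Forcing one parent per person turns a tangled graph into separate trees, which is part of why the families come out so large.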

Although the MGP is still somewhat 
US-centric, the goal is for it to become as inter- 
national as possible, Keller says. 

Peculiarly, the progenitor of the largest 
family tree is not a mathematician but a phy- 
sician: Sigismondo Polcastro, who taught 
medicine at the University of Padua in Italy 
in the early fifteenth century. He has 56,387 
descendants according to the analysis (see 
‘Mathematical clans’). The second-largest tree 
is one started by a Russian called Ivan Dolbnya 


MATHEMATICAL CLANS
Two-thirds of mathematicians in the
Mathematics Genealogy Project (MGP) belong
to just 24 distinct academic families, according
to an analysis that assigns ‘parenthood’ based
on teacher–pupil relationships.

Scholars who founded the mathematical families:
Sigismondo Polcastro (1384-1473, Italy): 56,387 followers
Ivan Petrovich Dolbnya (1853-1912, Russia): 18,968 followers
Jean Le Rond d'Alembert (1717-1783, France): 15,732 followers
Friedrich Leibniz (1597-1652, Germany): 10,039 followers
Henry Bracken (1697-1764, England): 8,178 followers
Remaining 19 founders: 33,882 followers

65% of mathematicians in the MGP
in the late nineteenth century.

The authors also tracked mathematical 
activity by country, which seemed to pinpoint 
major historical events. Around the time of 
the dissolution of the Austro-Hungarian 
Empire, there is a decline in mathematics 
PhDs awarded in the region, notes Gargiulo. 
Between 1920 and 1940, the United States 
took over from Germany as the country pro- 
ducing the largest number of mathematics 
PhDs each year (see ‘Mathematics in flux’). 
And the ascendancy of the Soviet Union is 
marked by a peak of PhDs in the 1960s, fol- 
lowed by a relative fall after the break-up of 
the union in 1991. 

Gargiulo’s team also looked at the 
dominance of mathematical subfields rela- 
tive to each other. The researchers found that 
dominance shifted from mathematical phys- 
ics to pure maths during the first half of the 
twentieth century, and later to statistics and 
other applied disciplines, such as computer 
science. 

Idiosyncrasies in the field of mathematics
could explain why it has the most comprehensive
genealogy database of any discipline.
“Mathematicians are a bit of a world apart,” says
Roberta Sinatra, a network and data scientist at
Central European University in Budapest who
led a 2015 study that mapped the evolution of
the subdisciplines of physics by mining data
from papers on the Web of Science².


MATHEMATICS IN FLUX
The proportion of mathematics PhDs produced by various countries
changes over the centuries, tracing geopolitical trends.

Mathematicians tend to publish less than 
other researchers, and they establish their 
academic reputation not so much on how 
frequently they publish or on their number of 
citations, but on who they have collaborated 
with, including their mentors, she says. “I 
think it’s not a coincidence that they have this 
genealogy project.” 

At least one discipline is trying to catch
up. Historian of astronomy Joseph Tenn of
Sonoma State University in California plans
by 2017 to launch the AstroGen project to
record the PhD advisers and students of
astronomers. “I started it,” he says, “because so
many of my colleagues in astronomy admired
and enjoyed perusing the Mathematics
Genealogy Project.”


1. Gargiulo, F. et al. EPJ Data Sci. 5, 26 (2016).
2. Sinatra, R. et al. Nature Phys. 11, 791-796 (2015). 
COULD THE MOLECULE
KNOWN FOR STORING 
GENETIC INFORMATION ALSO 
STORE THE WORLD’S DATA? 


For Nick Goldman, the idea of encoding data in
DNA started out as a joke.
It was Wednesday 16 February 2011, and 
Goldman was at a hotel in Hamburg, Germany, 
talking with some of his fellow bioinformaticists 
about how they could afford to store the reams of 
genome sequences and other data the world was 
throwing at them. He remembers the scientists 
getting so frustrated by the expense and limitations of conventional 
computing technology that they started kidding about sci-fi alternatives.
“We thought, ‘What’s to stop us using DNA to store information?’”

Then the laughter stopped. “It was a lightbulb moment,” says 
Goldman, a group leader at the European Bioinformatics Institute (EBI) 
in Hinxton, UK. True, DNA storage would be pathetically slow com- 
pared with the microsecond timescales for reading or writing bits in a 
silicon memory chip. It would take hours to encode data by synthesiz- 
ing DNA strings with a specific pattern of bases, and still more hours to 
recover that information using a sequencing machine. But with DNA, a 
whole human genome fits into a cell that is invisible to the naked eye. For 
sheer density of information storage, DNA could be orders of magnitude 
beyond silicon — perfect for long-term archiving. 

“We sat down in the bar with napkins and biros,” says Goldman, 
and started scribbling ideas: “What would you have to do to make that 
work?” The researchers’ biggest worry was that DNA synthesis and 
sequencing made mistakes as often as 1 in every 100 nucleotides. This 
would render large-scale data storage hopelessly unreliable — unless 
they could find a workable error-correction scheme. Could they encode
bits into base pairs in a way that would allow them to detect and undo
the mistakes? “Within the course of an evening,” says Goldman, “we
knew that you could.”
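
To make the idea concrete, here is a minimal sketch of one way bits can be written as bases with enough redundancy to undo occasional errors. It is not the EBI team's scheme, which the text does not spell out. It assumes a fixed two-bits-per-base mapping and three stored copies of each strand, read back by majority vote, and it corrects substituted bases only, not insertions or deletions. All names are illustrative.

```python
from collections import Counter

BASES = "ACGT"  # assumed mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T


def encode(data: bytes, copies: int = 3) -> list[str]:
    """Write each byte as four bases and return several identical strands."""
    strand = "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )
    return [strand] * copies


def decode(reads: list[str]) -> bytes:
    """Take a per-position majority vote across reads, then rebuild the bytes."""
    consensus = [Counter(column).most_common(1)[0][0] for column in zip(*reads)]
    out = bytearray()
    for i in range(0, len(consensus), 4):
        byte = 0
        for base in consensus[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    reads = encode(b"DNA")
    reads[0] = "T" + reads[0][1:]  # simulate one misread base in one copy
    print(decode(reads))
```

With three copies, a single wrong base at any position is outvoted by the other two reads. Repetition is a blunt way to buy reliability, since it triples the amount of DNA that must be synthesized.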

He and his EBI colleague Ewan Birney took the idea back to their labs,
and two years later announced that they had successfully used DNA
to encode five files, including Shakespeare’s sonnets and a snippet of
Martin Luther King’s ‘I have a dream’ speech¹. By then, biologist George
Church and his team at Harvard University in Cambridge, Massachusetts,
had unveiled an independent demonstration of DNA encoding².
But at 739 kilobytes (kB), the EBI files comprised the largest DNA archive
ever produced — until July 2016, when researchers from Microsoft and
the University of Washington claimed a leap to 200 megabytes (MB).

The latest experiment signals that interest in using DNA as a storage 
medium is surging far beyond genomics: the whole world is facing a data 
crunch. Counting everything from astronomical images and journal 
articles to YouTube videos, the global digital archive will hit an estimated 
44 trillion gigabytes (GB) by 2020, a tenfold increase over 2013. By 2040, 
if everything were stored for instant access in, say, the flash memory 
chips used in memory sticks, the archive would consume 10-100 times 
the expected supply of microchip-grade silicon³.

That is one reason why permanent archives of rarely accessed data 
currently rely on old-fashioned magnetic tapes. This medium packs in 
information much more densely than silicon can, but is much slower 
to read. Yet even that approach is becoming unsustainable, says David 
Markowitz, a computational neuroscientist at the US Intelligence 
Advanced Research Projects Activity (IARPA) in Washington DC. 
It is possible to imagine a data centre holding an exabyte (one billion 
gigabytes) on tape drives, he says. But such a centre would require 
US$1 billion over 10 years to build and maintain, as well as hundreds 
of megawatts of power. “Molecular data storage has the potential to 
reduce all of those requirements by up to three orders of magnitude,” 
says Markowitz. If information could be packaged as densely as it is in 
the genes of the bacterium Escherichia coli, the world’s storage needs 
could be met by about a kilogram of DNA (see ‘Storage limits’). 

Achieving that potential won't be easy. Before DNA can become a 
viable competitor to conventional storage technologies, researchers 
will have to surmount a host of challenges, from reliably encoding
information in DNA and retrieving only the information a user 
needs, to making nucleotide strings cheaply and quickly enough. 

But efforts to meet those challenges are picking up. The 
Semiconductor Research Corporation (SRC), a foundation in Durham,
North Carolina, that is supported by a consortium of chip-making
firms, is backing DNA storage work. Goldman and Birney
have UK government funding to experiment with next-generation 
approaches to DNA storage and are planning to set up a company 
to build on their research. And in April, IARPA and the SRC hosted 
a workshop for academics and industry researchers, including from 
companies such as IBM, to direct research in the field. 

“For ten years we’ve been looking beyond silicon” for data archiving,
says SRC director and chief scientist Victor Zhirnov. “It is very
difficult to replace,” he says. But DNA, one of the strongest candidates
yet, “looks like it may happen.”


LONG-TERM MEMORY 

The first person to map the ones and zeroes of digital data onto the 
four base pairs of DNA was artist Joe Davis, in a 1988 collaboration
with researchers from Harvard. The DNA sequence, which they
inserted into E. coli, encoded just 35 bits. When organized into a 5 × 7
matrix, with ones corresponding to dark pixels and zeroes corre- 
sponding to light pixels, they formed a picture of an ancient Germanic 
rune representing life and the female Earth. 
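
To make the bits-to-picture step concrete, here is a minimal sketch that lays 35 bits out as a grid of dark and light pixels. The grid orientation (five columns by seven rows) is an assumption, and the bit pattern is an arbitrary placeholder, not the rune that Davis encoded.

```python
def render_bitmap(bits: str, width: int = 5, height: int = 7) -> list[str]:
    """Lay a bit string out row by row; '1' is a dark pixel, '0' a light one."""
    if len(bits) != width * height:
        raise ValueError("expected exactly width * height bits")
    rows = []
    for r in range(height):
        row_bits = bits[r * width:(r + 1) * width]
        rows.append("".join("#" if b == "1" else "." for b in row_bits))
    return rows


# Hypothetical 35-bit pattern (a vertical bar), used only for illustration.
PLACEHOLDER_BITS = "00100" * 7

if __name__ == "__main__":
    print("\n".join(render_bitmap(PLACEHOLDER_BITS)))
```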

Today, Davis is affiliated with Church’s lab, which began to explore 
DNA data storage in 2011. The Harvard team hoped the application 
might help to reduce the high cost of synthesizing DNA, much as 
genomics had reduced the cost of sequencing. Church carried out the 
proof-of-concept experiments in November 2011 along with Sri Kosuri,
now at the University of California, Los Angeles, and genomics expert
Yuan Gao at Johns Hopkins University in Baltimore, Maryland. The
team used many short DNA strings to encode a 659-kB version of a book
Church had co-authored. Part of each string was an address that specified 
how the pieces should be ordered after sequencing, with the remainder 
containing the data. A binary zero could be encoded by the bases adenine 
or cytosine, and a binary one could be represented by guanine or thymine. 
That flexibility helped the group to design sequences that avoided reading 
problems, which can occur with regions containing lots of guanine and 
cytosine, repeated sections, or stretches that bind to one another and make 
the strings fold up. They didn't have error correction in the strict sense, 
instead relying on the redundancy provided by having many copies of 
each individual string. Consequently, after sequencing the strings, Kosuri, 
Church and Gao found 22 errors — far too many for reliable data storage. 
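
The one-bit-per-base idea described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the Harvard team's code: the address and payload lengths are hypothetical, and the only sequence constraint handled is avoiding two identical bases in a row (GC content and self-binding are ignored).

```python
ZERO_BASES = "AC"  # a binary 0 may be written as adenine or cytosine
ONE_BASES = "GT"   # a binary 1 may be written as guanine or thymine


def encode_bits(bits: str) -> str:
    """One base per bit, using the free choice to avoid repeating a base."""
    seq = []
    for bit in bits:
        options = ZERO_BASES if bit == "0" else ONE_BASES
        choice = options[1] if seq and seq[-1] == options[0] else options[0]
        seq.append(choice)
    return "".join(seq)


def decode_bases(seq: str) -> str:
    return "".join("0" if base in ZERO_BASES else "1" for base in seq)


def make_strings(bits: str, payload_len: int = 8, address_len: int = 4) -> list[str]:
    """Split the data into short strings, each prefixed with a binary address."""
    chunks = [bits[i:i + payload_len] for i in range(0, len(bits), payload_len)]
    if len(chunks) > 2 ** address_len:
        raise ValueError("address field too short for this much data")
    return [encode_bits(format(i, f"0{address_len}b") + chunk)
            for i, chunk in enumerate(chunks)]


def reassemble(strings: list[str], address_len: int = 4) -> str:
    """Decode each string, sort by address, and join the payloads."""
    decoded = sorted((decode_bases(s) for s in strings),
                     key=lambda d: int(d[:address_len], 2))
    return "".join(d[address_len:] for d in decoded)
```

Because every string carries its own address, the pool can be sequenced in any order and still be put back together. Nothing here detects a misread base, which is the gap the paragraph above points to.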

At the EBI, meanwhile, Goldman, Birney and their colleagues were 
also using many strings of DNA to encode their 739-kB data store, which
included an image, ASCII text, audio files and a PDF version of Watson
and Crick’s iconic paper on DNA’s double-helix structure. To avoid
repeating bases and other sources of error, the EBI-led team used a more
complex scheme (see ‘Making memories’). One aspect involved encoding the
data not as binary ones and zeroes, but in base three — the equivalent 
of zero, one and two. They then continuously rotated which DNA base 
represented each number, so as to avoid sequences that might cause prob- 
lems during reading. By using overlapping, 100-base-long strings that 
progressively shifted by 25 bases, the EBI scientists also ensured that there 
would be four versions of each 25-base segment for error-checking and 
comparison against each other. 
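
A rough sketch of the two ideas in this paragraph: a base-three code whose digit-to-base mapping rotates so that no base ever repeats, and overlapping 100-base strings shifted by 25. The fixed six-digits-per-byte conversion and the starting base are simplifying assumptions made here for illustration; they are not taken from the EBI paper.

```python
BASES = "ACGT"


def bytes_to_trits(data: bytes) -> list[int]:
    """Assumption: each byte becomes six base-three digits (3**6 = 729 >= 256)."""
    trits = []
    for byte in data:
        digits = []
        for _ in range(6):
            byte, remainder = divmod(byte, 3)
            digits.append(remainder)
        trits.extend(reversed(digits))
    return trits


def trits_to_bytes(trits: list[int]) -> bytes:
    out = bytearray()
    for i in range(0, len(trits), 6):
        value = 0
        for trit in trits[i:i + 6]:
            value = value * 3 + trit
        out.append(value)
    return bytes(out)


def trits_to_dna(trits: list[int], prev: str = "A") -> str:
    """Each digit picks one of the three bases that differ from the previous base."""
    out = []
    for trit in trits:
        candidates = [b for b in BASES if b != prev]
        prev = candidates[trit]
        out.append(prev)
    return "".join(out)


def dna_to_trits(seq: str, prev: str = "A") -> list[int]:
    trits = []
    for base in seq:
        candidates = [b for b in BASES if b != prev]
        trits.append(candidates.index(base))
        prev = base
    return trits


def overlapping_fragments(seq: str, length: int = 100, step: int = 25) -> list[str]:
    """Cut the sequence into windows of `length` bases that advance by `step`."""
    return [seq[i:i + length] for i in range(0, len(seq) - length + 1, step)]
```

In this simplified version, any 25-base segment away from the two ends of the sequence appears in four different fragments, so a segment lost or misread in one string can be checked against the other three. Segments near the ends are covered fewer times; handling them is left out of the sketch.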

They still lost 2 of the 25-base sequences — ironically, part of the Wat- 
son and Crick file. Nevertheless, these results convinced Goldman that 
DNA had potential as a cheap, long-term data repository that would 
require little energy to store. As a measure of just how long-term, he 
points to the 2013 announcement of a horse genome decoded from a 
bone trapped in permafrost for 700,000 years*. “In data centres, no one 
trusts a hard disk after three years,” he says. “No one trusts a tape after at 
most ten years. Where you want a copy safe for more than that, once we 
can get those written on DNA, you can stick it in a cave and forget about 
it until you want to read it.”


A BURGEONING FIELD 

That possibility has captured the imaginations of computer scientists Luis 
Ceze, from the University of Washington, and Karin Strauss, from Micro- 
soft Research in Redmond, Washington, ever since they heard Goldman 
discuss the EBI work when they visited the United Kingdom in 2013. 
“DNA’s density, stability and maturity have made us excited about it,” says
Strauss. 

And on their return to Washington state, says 
Strauss, she and Ceze started investigations with 
their University of Washington collaborator 
Georg Seelig. One of their chief concerns has been 
another major drawback that goes well beyond 
DNA’s vulnerability to errors. Using standard
sequencing methods, there was no way to retrieve 
any one piece of data without retrieving all the 
data: every DNA string had to be read. That would be vastly more cum- 
bersome than conventional computer memory, which allows for random 
access: the ability to read just the data that a user needs. 

The team outlined its solution in early April at a conference in Atlanta, 
Georgia. The researchers start by withdrawing tiny samples from their 
DNA archive. They then use the polymerase chain reaction (PCR) to 
pinpoint and make more copies of the strings encoding the data they 
want to extract*. The proliferation of copies makes the sequencing faster, 
cheaper and more accurate than previous approaches. The team has also 
devised an alternative error-correction scheme that the group says allows
for data encoding twice as dense as the EBI’s, but just as reliable.
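
The selective-retrieval step can be pictured with a toy model. In the sketch below, each string begins with a short primer sequence that identifies the file it belongs to, and "amplification" simply multiplies the count of matching strings. The primer sequences, file names and ideal doubling per cycle are all assumptions for illustration; real PCR uses primer pairs and is not perfectly efficient.

```python
from collections import Counter

# Hypothetical primers, one per stored file.
PRIMERS = {"cat": "ACGTAC", "opera_house": "TGCATG"}


def pcr_amplify(pool: list[str], primer: str, cycles: int = 10) -> Counter:
    """Idealized PCR: every strand starting with the primer doubles each cycle."""
    copies = Counter()
    for strand in pool:
        if strand.startswith(primer):
            copies[strand] += 2 ** cycles
    return copies


# A tiny mixed pool: two strings for one file, one for the other.
pool = [
    PRIMERS["cat"] + "GATTACAGATTACA",
    PRIMERS["opera_house"] + "CTGACTGACTGACT",
    PRIMERS["cat"] + "TCAGTCAGTCAGTC",
]

if __name__ == "__main__":
    for strand, n in pcr_amplify(pool, PRIMERS["cat"]).items():
        print(n, strand)
```

After amplification the wanted strings vastly outnumber everything else in the sample, so a sequencer reads mostly the requested file rather than the whole archive.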

As a demonstration, the Microsoft–University of Washington
researchers stored 151 kB of images, some encoded using the EBI
method and some using their new approach, in a single pool of strings.
They extracted three (a cat, the Sydney Opera House and a cartoon
monkey) using the EBI-like method, getting one read error that they had to
correct manually. They also read the Sydney Opera House image using
their new method, without any mistakes.

DNA DATA-ENCODING SCHEMES SUCH AS THIS ONE ARE DESIGNED
TO MINIMIZE ERRORS IN SYNTHESIZING AND SEQUENCING THE
MOLECULE, AND THEN CORRECT ANY ERRORS THAT DO OCCUR.

STORAGE LIMITS
Estimates based on bacterial genetics suggest that digital DNA
could one day rival or exceed today’s storage technology.


ECONOMICS VERSUS CHEMISTRY 

At the University of Illinois at Urbana-Champaign, computer scientist 
Olgica Milenkovic and her colleagues have developed a random-access 
approach that also enables them to rewrite the encoded data. Their
method stores data as long strings of DNA that have address sequences 
at both ends. The researchers then use these addresses to select, amplify 
and rewrite the strings using either PCR or the gene-editing technique 
CRISPR-Cas9. 

The addresses have to avoid sequences that would hamper reading 
while also being different enough from each other to stop them being 
mixed up in the presence of errors. Doing this (and avoiding problems
such as molecules folding up because their sequences contain stretches
that recognize and bind to each other) took intense calculations. “At
the beginning, we used computer search because it was really difficult
to come up with something that had all these properties,” Milenkovic
says. Her team has now replaced this labour-intensive process with
mathematical formulae that allow them to devise an encoding scheme 
much more quickly. 
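
A brute-force search of the kind Milenkovic describes might look something like the sketch below. The thresholds (balanced GC content, no run of three identical bases, a minimum Hamming distance between addresses) are illustrative assumptions, and the check for self-binding stretches is omitted; the Illinois group's actual criteria and formulae are not given in this article.

```python
from itertools import product


def gc_fraction(seq: str) -> float:
    return sum(base in "GC" for base in seq) / len(seq)


def has_homopolymer(seq: str, run: int = 3) -> bool:
    """True if any base occurs `run` or more times in a row."""
    return any(base * run in seq for base in "ACGT")


def hamming(a: str, b: str) -> int:
    return sum(x != y for x, y in zip(a, b))


def search_addresses(length: int = 8, count: int = 4, min_distance: int = 4) -> list[str]:
    """Greedy scan over all sequences, keeping those that pass every check."""
    chosen: list[str] = []
    for candidate in product("ACGT", repeat=length):
        seq = "".join(candidate)
        if not 0.4 <= gc_fraction(seq) <= 0.6:
            continue  # too GC-rich or GC-poor to read reliably (assumed limits)
        if has_homopolymer(seq):
            continue  # repeated bases are error-prone
        if all(hamming(seq, other) >= min_distance for other in chosen):
            chosen.append(seq)
            if len(chosen) == count:
                break
    return chosen
```

The search space grows as 4 to the power of the address length, which gives a sense of why exhaustive search becomes impractical for realistic addresses and why closed-form constructions are attractive.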

Other challenges for DNA data storage are scale and speed of synthe- 
sizing the molecules, says Kosuri, who admits that he has not been very 
bullish about the idea for that reason. During the early experiments at 
Harvard, he recalls, “we had 700 kB. Even a 1,000-fold increase on that 
is 700 MB, which is a CD”. Truly making a difference to the worldwide 
data archiving problem would mean storing information by the petabyte 
at least. “It’s not impossible,” says Kosuri, “but people have to realize the
scale is on the order of million-fold improvements.” 

That will not be easy, agrees Markowitz. “The dominant production 
method is an almost 30-year-old chemical process that takes upwards 
of 400 seconds to add each base,” he says. If this were to remain the 
approach used, he adds, billions of different strings would have to be 
made in parallel for writing to be fast enough. The current maximum 
for simultaneous production is tens of thousands of strings. 

A closely related factor is the cost of synthesizing DNA. It accounted 
for 98% of the expense of the $12,660 EBI experiment. Sequencing 
accounted for only 2%, thanks to a two-millionfold cost reduction since 
the completion of the Human Genome Project in 2003. Despite this 
precedent, Kosuri isn’t convinced that economics can drive the same 
kind of progress in DNA synthesis. “You can easily imagine markets 
to sequence 7 billion people, but there’s no case for building 7 billion 
people’s genomes,” he says. He concedes that some improvement in costs
might result from Human Genome Project-Write (HGP-write), a pro- 
ject proposed in June by Church and others. If funded, the programme 
would aim to synthesize an entire human genome: 23 chromosome 
pairs containing 3.2 billion nucleotides. But even if HGP-write succeeds, 
says Kosuri, a human genome contains just 0.75 GB of information and 
would be dwarfed by the challenge of synthesizing practical data stores. 
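
One plausible way to arrive at a figure of about 0.75 GB is to count two bits per nucleotide, since each position is one of four bases. This derivation is an assumption, not something the article spells out; the snippet simply works through the arithmetic.

```python
import math

NUCLEOTIDES = 3_200_000_000          # figure quoted in the text
BITS_PER_BASE = math.log2(4)         # four possible bases -> 2 bits each


def genome_size_bytes(nucleotides: int = NUCLEOTIDES) -> float:
    return nucleotides * BITS_PER_BASE / 8


if __name__ == "__main__":
    size = genome_size_bytes()
    print(f"{size / 1e9:.2f} GB (decimal gigabytes)")
    print(f"{size / 2**30:.2f} GiB (binary gigabytes)")
```

Under this assumption the genome holds 0.8 billion bytes, which is 0.8 GB in decimal units and roughly 0.75 GB when a gigabyte is taken as 2^30 bytes.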

Zhirnov, however, is optimistic that the cost of synthesis can be orders 
of magnitude below today’s levels. “There are no fundamental reasons 
why it’s high,” he says. 

In April, Microsoft Research made an early move that may help 
create the necessary demand, ordering 10 million strings from Twist
Bioscience, a DNA synthesis start-up company in San Francisco, California.
Strauss and her colleagues say they have been using the strings
to push their random-access storage approach to 0.2 GB. The details 
remain unpublished, but the archive reportedly includes the Universal 
Declaration of Human Rights in more than 100 languages, the top 100 
books of Project Gutenberg and a seed database. Although this is much
less of a synthesis challenge than the HGP-write faces, Strauss stresses 
the significance of the 250-fold jump in storage capacity. 

“It was time to exercise our muscle handling larger volumes of DNA
to push it to a larger scale and see where the process breaks,” she says.
“It actually breaks in multiple places, and we’re learning a great deal
out of it.”

Goldman is confident that this is just a taste of things to come. “Our 
estimate is that we need 100,000-fold improvements to make the tech- 
nology sing, and we think that’s very credible,” he says. “While past 
performance is no guarantee, there are new reading technologies com- 
ing onstream every year or two. Six orders of magnitude is no big deal 
in genomics. You just wait a bit.” 


Andy Extance is a freelance writer in Exeter, UK. 
The snakebite fight


Snakes kill tens of thousands of people each year. But experts can’t agree 
on how best to overcome a desperate shortage of antivenom. 


Abdulsalam Nasidi’s phone rang shortly
after midnight: Nigeria’s health minister
was on the line. Nasidi, who worked at
the country’s Federal Ministry of Health,
learnt that he was needed urgently in the Benue
valley to investigate a cluster of dying patients.
People were bleeding out of their noses, their
mouths, their eyes. Names of spine-chilling
viruses such as Ebola, Lassa and Marburg raced
through Nasidi’s mind.

When he arrived in Benue, he found people
splayed on the ground and tents serving
as makeshift hospital wards and morgues. But
Nasidi quickly realized that the cause of the 
mystery illness was millions of times larger 
than any virus. The onset of the rainy season 
had brought the start of spring planting for 
farmers in the valley, and flooding had dis- 
turbed the resident carpet vipers (Echis ocel- 
latus). Many farmers were simply too poor to 
buy boots — and their exposed feet became 
targets for the highly venomous snakes. 
Nasidi wanted to help, but he found him- 
self with limited tools. He had only a small 
amount of antivenom with which to neutralize
the toxin, and it quickly ran out. Once the
hospital exhausted its supply, people stopped 
coming. No one knows how many people were 
killed. In an average year, hundreds of Nigeri- 
ans die from snakebite, and that rainy season, 
which started in 2012, was far from average. 
Snakebites are a growing public-health crisis. 
According to the World Health Organization, 
around 5 million people worldwide are bitten 
by snakes each year; more than 100,000 of them 
die and as many as 400,000 endure amputations
and permanent disfigurement. Some estimates
point to a higher toll: one systematic survey
concluded that in India alone, more than
45,000 people died in 2005 from snakebite,
around one-quarter the number that died from
HIV/AIDS (see ‘The toll of snakebite’). “It’s the
most neglected of the world’s neglected tropical
diseases,” says David Williams, a toxinologist
and herpetologist at the University of Melbourne,
Australia, and chief executive of the non-profit
organization Global Snakebite Initiative in Herston.

© 2016 Macmillan Publishers Limited, part of Springer Nature. All rights reserved.

MATTIAS KLUM/NGS
Bites from venomous snakes such as the Jameson’s mamba
(Dendroaspis jamesoni) are a public-health crisis.

Many of those bites are treatable with exist- 
ing antivenoms, but there are not enough to go 
around. This long-standing problem became 
international news in September 2015, when 
Médecins Sans Frontières (MSF, also known
as Doctors Without Borders) announced
that the last remaining vials of the antivenom
Fav-Afrique, used to treat bites from several
of Africa’s deadliest snakes, were about to
expire. The French pharma giant Sanofi Pasteur
in Lyons had decided to cease production
in 2014. MSF estimates that this will cause an
extra 10,000 deaths in Africa each year, an
“Ebola-scale disaster”, according to Julien Potet,
a policy adviser for MSF in Paris. Yet, because 
most of those affected by snakebites are in the 
poorest regions of the world, the issue has been 
largely ignored. 


SPOTLIGHT ON SNAKES 
In May, however, the crisis was discussed for the 
first time at the annual World Health Assembly
meeting in Geneva, Switzerland. The world’s 
handful of snakebite specialists gathered in 
a small conference room in the Palais des 
Nations — although they shared concern over 
the problem, they were split about how to solve 
it. Many want to use synthetic biology and other 
high-tech tools to develop a new generation of 
broad-spectrum antivenoms. Others argue that 
existing antivenoms are safe, effective and low 
cost, and that the focus should be on improving 
their production, price and use. “From the physician
perspective, patient care and public health
comes before anything new,” says Leslie Boyer,
who directs an institute dedicated to antivenom 
study at the University of Arizona, Tucson. 

The debate mirrors those around many 
other developing-world challenges, from 
improving agriculture to providing clean 
drinking water. Do people need high- 
tech solutions, or can cheaper, lower-tech 
remedies do the job? The answer is simple to 
Jean-Philippe Chippaux, a physician work- 
ing on snakebite for the French Institute of 
Research for Development in Cotonou, Benin. 
“We have the ability to fix this problem now. 
We just lack the will to do it,” he says.

Every December, Williams sees snakebite 
victims flood into the Port Moresby General 
Hospital in Papua New Guinea. Nearly all of
them were bitten by the taipan (Oxyuranus
scutellatus), one of the world’s deadliest snakes, 
which emerges at the start of the rainy season. 
The venom stops a victim's blood from clot- 
ting, paralyses muscles and leads to a slow, ago- 
nizing death. It seems a far cry from Australia, 
where Williams is based. “There’s this incredible
suffering just 90 minutes away from the
modern world,” he says.

Yet Williams knows that these people are the 
lucky ones. The hospital ward, which might 
be treating as many as eight taipan victims at 
any time, is often the only place in the country 
with antivenom drugs. Without them, some 
10–15% of all snakebite victims die; with them,
just 0.5% do. The situation is reflected around
the world. “Many countries don’t want to admit
that they have such a primeval-sounding problem,”
Chippaux says.

The method used to make antivenom has 
changed little since French physician Albert 
Calmette developed it in the 1890s. Research- 
ers inject minuscule amounts of venom, 
milked from snakes, into animals such as 
horses or sheep to stimulate the production 
of antibodies that bind to the toxins and neu- 
tralize them. They gradually increase doses of 
venom until the animal is pumping out huge 
amounts of neutralizing antibodies, which are 
purified from the blood and administered to 
snakebite victims. 

Across much of Latin America, government-funded
labs typically produce antivenoms and
distribute them free of charge. But in other
areas, especially sub-Saharan Africa, these
life-saving medications are too often out of
reach. Many governments lack the infrastructure
or political will to purchase and distribute
antivenom. Bribery and corruption often jack
up the price of an otherwise inexpensive drug
from a typical wholesale cost of US$18 to $200
per vial to a retail cost between $40 and $24,000
for a complete treatment, according to a 2012
analysis. Not all hospitals and clinics can afford
the antivenom, and some won’t risk buying it
because their patients either can’t pay for it or
won’t, because they doubt that it really works.
With no reliable market for the medicines,
some pharmaceutical companies have halted
production. Sanofi Pasteur stopped making
Fav-Afrique because, at an average retail price
of around $120 per vial, it just couldn’t sell
enough to make production worthwhile. A
total of 35 government or commercial manufacturers
produce antivenom for distribution
around the world, but only 5 now make the
drugs for sub-Saharan Africa. In the absence of
medicines, snakebite victims have been known
to drink petrol, electrocute themselves or apply
a poultice of cow dung and water to the bite,
says Tim Reed, executive director of Health
Action International in Amsterdam.

But there are also problems with the drugs
themselves, says Robert Harrison, head of
the Alistair Reid Venom Research Unit at the
Liverpool School of Tropical Medicine, UK.
They often have a limited shelf life and require
continuous refrigeration, which is a problem
in remote areas without electricity. And many
are effective against just one species of snake,
so clinics need an array of medicines constantly
on hand. (A few, such as Fav-Afrique,
combine antibodies to create a broad-spectrum
product.)

Venoms from spiders and scorpions typically
have only one or two toxic proteins;
snake venoms can have more than ten times
that amount. They are a “pandemonium of
molecules”, says Alejandro Alagón, a toxinologist
at the National Autonomous University
of Mexico in Mexico City. Researchers do not
always know which proteins in this toxic soup
are the damaging ones, which is why some
think that smarter biology could help.


OLD PROBLEM, NEW SOLUTION

Ten years ago, teams led by Harrison and José
María Gutiérrez, a toxinologist at the University
of Costa Rica in San José, began parallel
efforts to create a universal antivenom
for sub-Saharan Africa using ‘venomics’ and
‘antivenomics’. The aim is to identify destructive
proteins in venoms using an array of techniques,
ranging from genome sequencing to
mass spectrometry, and then find the specific
parts, known as epitopes, that provoke an
immunological response and are neutralized
by the antibodies in antivenom drugs. The
ultimate goal is to use the epitopes to produce
antibodies synthetically, using cells rather
than animals, and develop antivenoms that are
effective against a wide range of snake species
in one part of the world.

The scientists have made slow but steady 
progress. Last year, Gutiérrez and his col- 
leagues separated and identified the most toxic 
proteins from a family of venomous snakes 
known as elapids (Elapidae). By combining 
information about the abundance of each
protein and how lethal it is to mice, the team created
a toxicity score to indicate how important
it was to neutralize a protein with antivenom, a
first step towards making the treatment.
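
A minimal sketch of how such a score could be computed, assuming it is the ratio of a protein's relative abundance in the venom to its median lethal dose (LD50) in mice. The article says only that the score combines abundance and lethality, so this exact form is an assumption, and the protein names and numbers below are hypothetical.

```python
def toxicity_score(abundance_percent: float, ld50_mg_per_kg: float) -> float:
    """Higher abundance and lower LD50 (more lethal) both raise the score."""
    if ld50_mg_per_kg <= 0:
        raise ValueError("LD50 must be positive")
    return abundance_percent / ld50_mg_per_kg


def rank_toxins(toxins: dict[str, tuple[float, float]]) -> list[tuple[str, float]]:
    """Return (name, score) pairs, most important to neutralize first."""
    scored = [(name, toxicity_score(abundance, ld50))
              for name, (abundance, ld50) in toxins.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical venom profile: name -> (abundance in %, LD50 in mg per kg).
EXAMPLE_VENOM = {
    "toxin_A": (30.0, 0.5),
    "toxin_B": (50.0, 5.0),
    "toxin_C": (5.0, 0.1),
}

if __name__ == "__main__":
    for name, score in rank_toxins(EXAMPLE_VENOM):
        print(f"{name}: {score:.1f}")
```

In this made-up profile the most abundant protein (toxin_B) ranks last because it is far less lethal, which is the kind of prioritization a combined score is meant to expose.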

THE TOLL OF SNAKEBITE
Snakebite affects more people each year than many other neglected tropical diseases,
and often causes death, disability or disfigurement. The issue receives little attention:
data are scarce and the condition mostly strikes the world’s poorest regions.

In March this year, a Brazilian team reported
that they had gone further, designing short
pieces of DNA that encode key toxic epitopes
in the venom of the coral snake (Micrurus
corallinus), a member of the elapid family.
Mice were injected with the DNA using a 
technique that enabled some to generate anti- 
bodies against coral-snake venom, and the 
group enhanced the mice’s immune responses 
by injecting them with synthetic antibodies 
manufactured in bacterial cells. These and 
other advances led Harrison to estimate that 
the first trials of new antivenoms in humans 
could be just three or four years away. But with 
so few researchers working on the problem, a 
paucity of funding and the biological complex- 
ity of snake venoms, he and others admit that 
this is an optimistic prediction. 

Despite the growing literature on antivenomics,
Alagón and Chippaux aren’t convinced that
the approach will help. Alagón estimates that
newly developed antivenoms would need to be 
priced at tens of thousands of dollars per dose 
to be financially viable to produce, and that no 
biotech or pharma company would manufac- 
ture one without substantial government subsi- 
dies. Compare that, he says, to the rock-bottom 
price of many existing antivenoms. “You can’t
get cheaper than that,” he says. “We can make an
entire lot of antivenoms in one day using technology
that’s been available for 80 years.”

Finding someone to produce new medica- 
tions might be a greater challenge than actually 
developing them, Williams acknowledges: gov- 
ernments or non-governmental organizations 
(NGOs) will almost certainly have to step in to 
help to defray the development costs. But he 
argues that now is the time to research alterna- 
tive approaches. These could “revolutionize the 
treatment of snakebite envenoming in the next 
10–15 years”, Williams says.


THE ROOM WHERE IT HAPPENED 
All these tensions, brewing for nearly a decade, 
came to a head at the Geneva meeting in May. 
Around 75 scientists, public-health experts 
and health-assembly delegates crowded 
around three long tables in a third-floor con- 
ference room at the United Nations Headquar- 
ters. Spring rain pelted the tall windows. 
Lights were dimmed, and then the screams of 
a toddler filled the room. A short documentary 
co-produced by the Global Snakebite Initiative
told the story of a girl bitten by a cobra whose 
parents carried her for days over rocky roads 
in Africa to find antivenom. They arrived in 
time — the girl survived — but she lost the use 
of her arm. Her sister had already died after a 
bite from the same snake. 

Convincing attendees of the scale of the 
problem was the meeting’s primary goal; how 
to solve it came next. For 90 minutes, scientists 
and NGOs made short, impassioned speeches 
laying out the scope of the issue and the variety 
of problems that they faced. At the centre of 
each presentation was the same message: we 
need more antivenom. 

But the meeting was strained. Chippaux and 
representatives of the African Society of Ven- 
omology were disappointed and angry that so 
few Africans had been invited to speak, even 
though the continent is where antivenom 
shortages are most acute. “Our voice, our
issues, were completely overlooked,” Chippaux
says. Seated at the front of the room, group 
members whispered and gestured frantically 
to each other, and Chippaux barely managed 
to keep them from storming out. 

They argue that the current antivenom short- 
age stems from Africa's reliance on foreign com- 
panies and governments for its drugs, and that 
the only solution lies in building up infrastruc- 
ture in Africa to produce its own high-quality 
antivenom. Alagón views antivenomics as a
dangerous diversion. “It’s distracting many
brilliant minds and resources from improving
antivenoms using existing technology,” he says.
“Perhaps by 2050 this will be the standard tech- 
nique, but the problem is now.” 

Williams and Gutiérrez take a middle ground. 
They feel that the problem requires attacks on all 
fronts. As well as innovation, Gutiérrez calls for 
existing manufacturers to step up the produc- 
tion of current drugs. 

There are signs of this happening already. 
Latin America has a long history of produc- 
ing antivenoms both for its own needs and for 
those of countries around the world, and even 
before Sanofi Pasteur announced that it would 
cease production of Fav-Afrique, Costa Rica, 
Brazil and Mexico were testing antivenoms 
for different parts of Africa. One product,
EchiTAb-Plus-ICB, is produced by Costa Rica 
and effective against a range of African viper 
species; it completed clinical trials in 2014 and 
is now available for use. Several other antiven- 
oms are expected to be ready in the next two 
years. The drugs should be affordable: govern- 
ment labs in Costa Rica have already indicated 
that they will not seek to make money from 
the antivenoms, just recoup their expenditures. 

But beyond that, the way forward remains 
murky. Williams knows that the World Health
Assembly meeting was just a start. Inevitably, 
more meetings will be needed to produce a con- 
crete action plan. But the discussion still gave 
him and some others a renewed sense of hope 
that the international community is beginning 
to take snakebite seriously — momentum they 
hope to build on by banging away at the topic at 
conferences and in the media. 

Boyer says that whatever solution the snake- 
bite field decides on, the most important thing 
is to “break the cycle of antivenom failure in 
Africa”. Doing that requires building trust 
from governments, health-care workers and 
the public that the drugs are safe and effective, 
that clinics will have antivenom on hand, and 
that people will be able to afford treatment. 
“Without that, you've got nothing,” Boyer 
says. Educating local clinics on how to care for 
snakebite victims and administer treatments 
in a timely manner would also go a long way 
towards preventing deaths. 

Speaking of the devastation he saw in Benue, 
Nasidi says that something as simple as provid- 
ing boots for poor farmers would have helped 
to prevent much of the suffering and death 
that he witnessed. It’s perhaps the ultimate in 
low-tech methods in snakebite protection: 
shielding vulnerable human skin.


Carrie Arnold is a writer based near 
Richmond, Virginia. 
Stop ignoring misconduct


Efforts to reduce irreproducibility in research 
must also tackle the temptation to cheat, argue 
Donald S. Kornfeld and Sandra L. Titus. 


The history of science shows that
irreproducibility is not a product of
our times. Some 350 years ago, the
chemist Robert Boyle penned essays on
“the unsuccessfulness of experiments”. He
warned readers to be sceptical of reported
work. “You will meet with several Observations
and Experiments, which... may upon
further tryal disappoint your expectation.”
He attributed the problem to a lack of skill
in the scientist and the lack of purity of
the ingredients, and what would today be
referred to as inadequate statistical power.

By 1830, polymath Charles Babbage was 
writing in more cynical terms. In Reflec- 
tions on the Decline of Science in England, he 
complains of “several species of impositions 
that have been practised in science”, namely
“hoaxing, forging, trimming and cooking”. 

In other words, irreproducibility is the 
product of two factors: faulty research prac- 
tices and fraud. Yet, in our view, current initia- 
tives to improve science dismiss the second 
factor. For example, leaders at the US National 
Institutes of Health (NIH) stated in 2014: 
“With rare exceptions, we have no evidence to 
suggest that irreproducibility is caused by scientific
misconduct”. In 2015, a symposium
of several UK science-funding agencies con- 
vened to address reproducibility, and decided 
to exclude discussion of deliberate fraud. 

To dismiss the role of research miscon- 
duct is mistaken and unfortunate. At best, 
ignoring deliberate misconduct in efforts to 
reduce irreproducibility is a wasted oppor- 
tunity, like tilling a field without clearing 
it of rocks. At worst, it permits destructive 
behaviour to persist and flourish. 


SCALE OF EVIDENCE 

Only 10-12 individuals are found guilty by 
the US Office of Research Integrity (ORI) 
each year. That number, which the NIH 
used to dismiss the role of research misconduct,
is misleadingly low, as numerous
studies show. For instance, a review of
2,047 life-science papers retracted from 1973
to 2012 found that around 43% were attributed
to fraud or suspected fraud. A compilation
of anonymous surveys suggests that
2% of scientists and trainees admit that they
have fabricated, falsified or modified data.
And a 1996 study of more than 1,000 postdocs
found that more than one-quarter
would select or omit data to improve their 
chances of receiving grant funding. 

Admittedly, many causes of irreproduc- 
ibility do not involve dishonesty. The NIH 
has promoted responsible research for 
25 years by funding studies on research 
integrity, creating educational resources and 
backing the ORI. 

Nonetheless, we contend that when sci- 
entific leaders minimize “hoaxing, forging, 
trimming and cooking” as contributors to 
irreproducibility, they choose to ignore the 
problem rather than confront it. This 
mechanism is what psychiatrists term 
denial, when an individual faces what they 
believe to be an insoluble problem. Deliber- 
ate misconduct is a reality that government 
funders can and must address. In 2012, an 
article in this journal declared that “the time 
is right to confront misconduct”. We agree; it 
is even more urgent now. We recommend five 
key approaches (see ‘Preventing misconduct’). 


TARGETED REMEDIES 

In the 1990s, the NIH mandated that all of the 
trainees it funds must receive a course on the 
responsible conduct of research. Not surpris- 
ingly, it failed in its goal of reducing research 
misconduct* — which it defines as fabrica- 
tion, falsification or plagiarism. Presumably, 
the ethics proscribing such practices are 
established long before people enter science. 
Instead, we propose interventions to address 
the psychological factors that motivate indi- 
viduals to commit misconduct, depending on 
their role in the research hierarchy. 

Those found guilty of misconduct by the 
ORI fall into three categories in roughly equal 
measure: trainees, support staff and senior 
scientists. Each has its own motivations’. 


Trainees. Many trainee missteps can be 
traced to a fear of failure and a lack of quality 
mentorship. One study’ of trainees who were 
found guilty of misconduct revealed that 62% 
of their mentors had not established adequate 
procedures, such as providing clear rules on 
data ownership and recording, safety, materi- 
als transfer or scheduling regular meetings, 
and 73% had not reviewed trainees’ raw data. 
A survey’ at a major US cancer centre found 
that nearly one-third of 140 trainees felt pres- 
sure to “prove” a mentor’s hypothesis, even 
though results did not support it. 

Some trainees who commit misconduct 
are perfectionists and are unable to cope 
with failure. Mentors should intervene with 
perspective, encouragement and even refer- 
ral to counselling. They should assure train- 
ees that there are respected careers outside 
a tenure-track appointment. Instead, jun- 
ior scientists report that they are treated as 
cheap labour; their professional develop- 
ment is a low priority. 

Funders should craft policies to ensure 
that mentors act as an adviser, teacher and 
role model, and should limit the number of 
trainees per mentor by discipline. Each year, 
trainees should be required to complete 
anonymous questionnaires evaluating their 
mentors, and results should be sent to fund- 
ing agencies as well as to research deans. 

Institutions should reward mentors for 
outstanding performance and provide 
adequate training. When justified, mentors 
should be held responsible for misconduct by 
their trainees and appropriately sanctioned. 


REMEDIES 
Preventing misconduct 

To diminish the threat that misconduct 
poses to science, scientists and society: 
• Authorities should acknowledge that 
deliberate misconduct is an important 
contributor to irreproducibility. 
• Mentors should be evaluated to 
assure quality; those who contribute to 
misconduct should be penalized. 
• Institutions and government agencies 
should have procedures to protect 
whistle-blowers from retaliation. 
• Senior faculty members who are 
found guilty of misconduct should face 
severe penalties. 
• Institutions that fail to establish and 
follow policies and processes to prevent 
misconduct should be sanctioned. 


Support staff. Workers such as laboratory 
technicians, phlebotomists and data collec- 
tors represent about one-third of the indi- 
viduals annually found guilty by the ORI of 
deliberate misconduct. They may have fal- 
sified data to boost their income or reduce 
their workload in response to an investiga- 
tor’s unrealistic productivity goals. 

Treating support staff as valued members 
of a team could go a long way. They should 
be made aware of the study’s goals, and how 
invalid publications harm scientific progress 
and patient care. 


Senior researchers. Established scientists 
would be less likely to commit misconduct 
if they were more concerned about being 
detected and punished. Currently, they 
conclude that the risk is low: few cases are 
referred to the ORI and few of their col- 
leagues want to be enmeshed in a conflict. 
More than 80% of faculty members 
say that they would be reluctant to report 
potential misconduct for fear of being ostra- 
cized and damaging their own reputations’. 
One ORI study found that 47 out of 68 peo- 
ple who reported misconduct experienced 
an adverse consequence. Clearly, concerns 
about making allegations are justified. 
Well-articulated policies are key to help- 
ing whistle-blowers come forward. So too is a 
well-trained research integrity officer (RIO), 
ideally a respected faculty member or admin- 
istrator. The potential whistle-blower must 
be confident that the institution’s RIO and its 
policies will protect them from retaliation. 


Institutions. Research centres should build 
a culture and infrastructure that encourages 
integrity. For example, peers can emphasize 
their commitment to robust data in every- 
day interactions and by supporting random 
audits; and data systems can date-stamp and 
track who accesses files to protect them from 
manipulation. Leaders should make it clear 
that they will not tolerate misconduct and that 
perpetrators will suffer severe consequences. 

An effective RIO is crucial for leading the 
educational and enforcement effort and in 
building trust in the integrity of the institu- 
tion. Unfortunately, studies have shown that 
many RIOs are poorly trained and do not 
manage allegations and investigations of 
research misconduct effectively. Perhaps most 
significantly, they might fail to adequately 
prepare and protect the whistle-blower’. The 
RIO must be selected thoughtfully and pro- 
vided with sufficient authority and support. 

Any institution that receives US federal 
research funds should be required to have 
at least one designated, trained and certi- 
fied RIO who has been assessed by the ORI. 
Moreover, research funds should not be 
released to an institution that cannot dem- 
onstrate current certification. 

Institutions that fail to establish and exe- 
cute policies to assure integrity should be 
held responsible when misconduct occurs. 
For example, in July 2014, Iowa State Univer- 
sity agreed to repay US$496,000 and forego 
$1.4 million in grants after one of its research- 
ers was found guilty of fraud. However, this 
penalty, as well as a prison sentence for the 
fraudster, happened only because a senator 
intervened. That should not be necessary. 

Government officials should be prepared 
to pursue repayments. The threat of such 
penalties should have a chilling effect on 
investigators contemplating research mis- 
conduct, and motivate institutions to estab- 
lish and implement policies that reflect their 
commitment to institutional integrity. 

We believe that these system-wide inter- 
ventions are essential to have an impact on 
the irreproducibility produced by research 
misconduct. 


IN RETROSPECT 


A New System of Chemical Philosophy 


Philip Ball reflects on the work of John Dalton, father of modern atomic theory. 


Visual metaphors are often 
essential in science when 
you can’t see what you’re 
studying. The English chemist 
John Dalton, born 250 years ago, 
illustrated his atomic theory using 
wooden spheres (pictured), drilled 
with holes for pins that enabled them 
to be linked into clusters. But there 
are hazards to such mental props. By 
the 1880s, students were so familiar 
with the spheres that one (taught 
by prominent advocate of atomic 
theory Henry Enfield Roscoe) 
declared: “Atoms are round bits of 
wood invented by Mr Dalton.” 
Today, the atoms Dalton pro- 
posed in his seminal New System 
of Chemical Philosophy (1808) are 
routinely revealed by microscopy 
and crystallography. They are cor- 
ralled in electromagnetic traps, 
pushed around like marbles using 
scanning probe microscopes, even 
manufactured and monitored one 
at a time in superheavy forms using 
particle accelerators. No one mis- 
takes them for bits of wood. 
Neither did Dalton. He articu- 
lated the ancient idea that matter 
is built from fundamental particles 
in a way that aligned it with the quantitative 
principles of chemical reaction elucidated in 
the late eighteenth century. Those macro- 
scopic rules, he said, stemmed from the sys- 
tematic combination of microscopic bodies: 
solid, massy and hard, as Isaac Newton had 
put it in a phrase Dalton was fond of quoting. 
Yet in a sense, even by the 1880s, atoms 
were still not much more than Dalton’s model 
spheres. Because they remained unobserved, 
several leading scientists refused to accept 
their reality, among them physicist Ernst 
Mach and chemist Wilhelm Ostwald. Some 
considered atoms no more than an heuristic 
convenience: a crutch that the mind could 
use to make sense of chemical transforma- 
tions. That is why, despite Roscoe's misgiv- 
ings that Dalton’s wooden balls might mislead 
students, the balls had a valuable role. They 
showed how visualizing an entity can help 
to cement the concept even while direct 
evidence is elusive. It is a risky strategy to 
assert the physical reality of something not yet 
observed (will dark matter really be particu- 
late?). But without such an image, a theory 
can seem little more than metaphysics. 

John Dalton, painted in 1835 by Thomas Philips. 

It is traditional to locate Dalton’s New 
System of Chemical Philosophy as a step — 
perhaps the greatest — in a long road to mod- 
ern atomic theory that began with the ancient 
Greek atomists Leucippus and Democritus 
in the fifth century BC, and ended with the 
nuclear atoms proposed by Ernest Ruther- 
ford and Niels Bohr in the early twen- 
tieth century, then quantum theory 
and scanning probe micro- 
scopes. The “philoso- 
phy” in Dalton’s title 
signified something 
closer to a scientific 
theory than to the 
abstract reasoning 
it tends to connote today. Yet his 
book also represents an important 
juncture for the philosophy of sci- 
ence. It spoke to whether science 
should be based on empiricism or 
explanatory hypothesis — a ques- 
tion that had exercised Newton 
and Robert Boyle in the seven- 
teenth century. There was nothing 
new in Dalton’s idea of atomistic 
matter; the question was whether 
to treat this as a useful conjecture 
or as a reality. Antoine Lavoisier, 
whose work on the proportions of 
chemical combination was crucial 
to Dalton, had no time for such 
questions. Lavoisier insisted that 
meditating on “ultimate particles” 
was metaphysical — and fruitless. 
So how did Dalton, a modest 
teacher educated in Cumbrian 
village schools and excluded from 
Oxford and Cambridge for his 
Quakerism, take an imaginative 
leap that eluded distinguished 
professors? Even if we admit some 
of the fairy dust of “genius” into an 
explanation, we shouldn’t discount 
Dalton’s wide reading — from 
Boyle and Newton to Claude Louis 
Berthollet and Humphry Davy. He 
also paid careful attention to the quantita- 
tive details of experiments by the likes of his 
friend, Mancunian chemist William Henry, 
and Lavoisier. Dalton presented his atom- 
istic theory to the Manchester Literary and 
Philosophical Society, of which he was sec- 
retary, between 1803 and 1805. Some of his 
papers were published in the society’s mem- 
oirs, but he was urged to present them as a 
book, as he put it, in “the interests of science, 
and his own reputation”. 
The New System is one of those foun- 
dational books that doesn’t say what 
you might think it should. It is mostly 
not about atoms at all. The first 
140 pages or so of Volume 1 
dwell on heat and its effects, 

The spheres that Dalton used to demonstrate atomic theory. 

A New System of Chemical Philosophy 
JOHN DALTON 
R Bickerstaff: 1808. 

inorganic chemical compounds. Dalton’s 
atomic theory is con- 
fined to the five-page 
final chapter of the first volume. Here, he 
explains that the fixed stoichiometries of 
chemical reactions — so much of element 
A combines with so much of B — can be 
rationalized by supposing that the constitu- 
ent atoms unite into “compound atoms” of 
simple ratios, such as 1:1 or 1:2. The point 
is most famously and eloquently made in a 
plate that shows sketches of these unions. An 
“atom” of water comprises one atom each of 
hydrogen and oxygen; an atom of ammonia is 
a 1:1 union of hydrogen and nitrogen (Dalton 
uses Lavoisier’s term, “azote”, for nitrogen). 

The proportions are wrong — chemist Jöns 
Jakob Berzelius corrected many in the follow- 
ing two decades. And in 1813, he proposed 
an alphabetical representation (for example, 
HO [sic]) in place of Dalton’s pictorial balls. 
Dalton, with the conservatism common to 
trailblazers, declared this “horrifying”, saying 
that the symbols “cloud the beauty and sim- 
plicity of the atomic theory”. His displeasure 
might have contributed to a stroke in 1837. 

The New System is not a new theory of 
chemistry. Among other things, it offers no 
explanation for why atoms react. Roscoe 
put his finger on it when he said that the 
significance of Dalton’s theory was his pro- 
posal that each type of atom has a unique 
mass. That made sense of the quantities in 
which elements were found to combine, 
and offered the first general and fundamen- 
tal distinction between one element and the 
next — what eventually became embodied 
in the idea of atomic number. 

Yet it is the idea of atoms as the indivis- 
ible units of matter that stuck in the mind, 
because readers could see them on the 
page. Dalton didn’t intend his pedagogical 
diagrams of atomic unions — “compound 
atoms”, or molecules as we’d now say — to be 
taken too literally. There’s no inkling in his 
book of molecular shape; the arrangements 
of atoms in binary, ternary and other unions 
are purely notional, and when Dalton draws 
“water particles” packed into the crystalline 
forms of ice, they too are spheres. 

All the same, visual representation of atoms 
was surely the precondition for the emer- 
gence of a concept of molecular structure, 
with atoms in fixed spatial relationships, in 
the mid-nineteenth century. Something 
of this kind would surely have appeared 
whether or not Dalton had “invented” atoms 
as wooden balls — but that innovation was 
more eloquent than its inventor anticipated. 


Philip Ball is a writer based in London. His 
latest book is The Water Kingdom. 
e-mail: p.ball@btinternet.com 
Reductionism in Art and Brain Science: Bridging the Two Cultures 
Eric R. Kandel COLUMBIA UNIVERSITY PRESS (2016) 

The sea-slug studies of Nobel-prizewinning neuroscientist Eric 
Kandel — which reveal the link between memory and synaptic 
connection — are models of reductionist science. In this intriguing 
treatise, Kandel finds methodological similarities in abstract art. By 
reducing image to colour, form or line, artists such as Piet Mondrian 
stimulated the brain’s “top-down processing” in the viewer, 
encouraging ‘active seeing’. Kandel deconstructs this intricate dance 
between perceiver and perceived by way of recent neuroscience 
findings and deft analyses of seminal artworks. 


Weapons of Math Destruction 

Cathy O’Neil CROWN (2016) 

While working as a Wall Street analyst during the 2008 crash, data 
scientist Cathy O’Neil realized how maths can fuel social problems. 
Her propulsive study reveals many models that are currently 
“micromanaging” the US economy as opaque and riddled with bias. 
These algorithmic overlords can taint policing and court sentences 
with racial profiling, and exacerbate unemployment rates in poor 
communities. In an era when many people uncritically applaud the 
power of big data, O’Neil argues for the dark side of the deluge to be 
tackled through algorithm audits, transparency and legal reform. 
Citizen Scientist: Searching for Heroes and Hope in an Age of Extinction 

Mary Ellen Hannibal THE EXPERIMENT (2016) 

In this inside story on citizen science and biodiversity loss, Mary 
Ellen Hannibal meshes interviews with front-line scientists such 
as James Estes (Nature 533, 318-319; 2016) with her own stints 
monitoring California wildlife. Inspired by the likes of marine 
biologist Ed Ricketts (Nature 516, 326-328; 2014), she records 
starfish die-offs, meets the geeks who track deforestation, and plans 
a web-based supercommunity of citizen scientists to counter what 
many are calling the sixth great extinction. A cogent call to action. 


A Brief History of Everyone Who Ever Lived: The Stories in Our Genes 
Adam Rutherford WEIDENFELD & NICOLSON (2016) 

Fifteen years ago, the first sequence and analysis of the human 
genome was published (E. S. Lander et al. Nature 409, 860-921; 
2001). A monumental surge in genetics followed. Science writer and 
broadcaster Adam Rutherford rides that tide and traces its effects, 
first focusing on how genetics has enriched and in some cases 
upset our understanding of human evolution, then examining the 
revelations of recent findings, such as deep flaws in the concept of 
race. Although digressive in the chapters on deep history, Rutherford 
unpeels the science with elegance. 


Trillion Dollar Baby 

Paul Cleary BITEBACK (2016) 

Norway’s government pension fund could hit US$1 trillion in just 
four years. In this crisp economic history stretching back more than 
four decades, journalist Paul Cleary charts how this middle-income 
Scandinavian country ensured that 90% of the cash flow from vast 
oil discoveries accrued to its government. But despite its record 
of pragmatic fair-mindedness, Norway’s eagerness to excavate 
environmentally sensitive reaches of the Arctic shows how its 
forward planning fails when it comes to climate change. Barbara Kiser 
Fallacy of perfection 
harms peer review 


Voltaire wrote in 1772, “the best is 
the enemy of the good”, warning 
against the fallacy that something 
is worthless if it is not perfect — a 
sentiment that seems common in 
scientific peer review today. 

The history of science has 
taught us that most progress has 
come from exploring flawed 
hypotheses and imperfect 
models. We must always strive for 
the better study, the better model, 
the better analysis. As experienced 
reviewers, however, we contend 
that seeking ultimate perfection is 
not the same as accepting nothing 
less here and now. Scientific 
progress depends on such 
compromise — provided that 
potential caveats are recognized. 

If a model is the most 
technically and ethically feasible 
approach available, and is better 
than random guessing, then it 
has some merit in advancing 
knowledge. Useful developments 
in biology, for example, have 
come from in vitro systems that 
do not reflect in vivo conditions, 
and from animal models that do 
not necessarily predict human 
disorders. 

The aim should be to 
utilize models, despite their 
imperfections, while continuing 
to improve them. It is unrealistic 
to hold progress in science to 
standards of perfection and 
certainty: progress is usually 
incremental and iterative. 

James C. Zimring 
BloodworksNW; and University 
of Washington, Seattle, USA. 
Steven L. Spitalnik Columbia 
University, New York, USA. 
jzimring@bloodworksnw.org 


Hallmark labs with a 
replicability record 


I am concerned that the tension 
between good research practice 
and scientific success is rising, 
despite recent efforts to shore up 
replicability (see Nature 
http://doi.org/bpmf; 2016). 

As others have noted, 
high-quality research should 
start in the lab, by validating cell 
lines and reagents, for example, 
and end with serious, meticulous 
review. That rarely happens 
because it is time-consuming, 
and time is every scientist's 
worst enemy — particularly for 
young researchers who face stiff 
competition in the scientific job 
market. 

To resolve this conflict, we need 
to work out how to change the 
incentive system so that it fosters 
a culture of good, responsible 
research. Reproducibility could 
be underpinned by a strict set of 
rules — including, say, systematic 
use of power analysis and sample- 
size estimation. To promote 
compliance and to counter any 
negative effect on productivity, 
and hence on competition for 
funding, labs with a record of 
high-quality research could be 
accredited with an international 
certificate of approval. An 
independent, not-for-profit 
organization might be responsible 
for awarding such certificates. 
Mattia Andreoletti European 
Institute of Oncology, Milan, Italy. 
mattia.andreoletti@ieo.eu 


Stuck between a rock 
and a hard place 


We question the basis of July’s 
ruling by an international 
tribunal that the disputed Spratly 
Islands (Nan-sha Islands) in 
the South China Sea are merely 
rocks (see Nature 535, 334-335; 
2016). 

According to the United 
Nations Convention on the 
Law of the Sea, which is signed 
by 178 countries, an island is 
defined by three elements: it is a 
natural formation; it lies in the 
same territory and economic 
zone as the surrounding sea; and 
it can sustain human habitation 
or have an economic life of its 
own (see go.nature.com/2bj4sit). 

One of the disputed islands, 
Taiping Island, fulfils all three 
criteria, having its own freshwater 
resources (see go.nature. 
com/2biujnv; in Chinese). There 
is also human habitation and 
economic activity on several of 
the other islands. 

In our view, the international 
court seems to have interpreted 
an island’s third defining 
element as requiring no external 
resources to sustain human 
settlement. If that were the case, 
the Maldives, Singapore and 
Hong Kong should probably 
be considered as rocks — they 
bring in gas and must obtain 
much of their fresh water by 
import and desalinization or 
from rain water. 

Yingchao Hu, Liangjun Hu 
Northeast Normal University, 
Changchun, China. 
hulj068@nenu.edu.cn 


Rate oceans’ capital 
to help achieve SDGs 


Goal 14 of the United Nations’ 
Sustainable Development 
Goals (SDGs) is dedicated 
to conserving and using the 
oceans and their resources for 
sustainable development. We 
suggest that a ‘gross marine 
product’ (GMP) index — a 
measure of the oceans’ natural 
capital — would be invaluable for 
achieving this goal. 

The seas provide us with 
food, materials, livelihoods and 
recreation. Managing these 
ecosystem services effectively 
can help us to eradicate 
poverty, develop sustainable 
economies and adapt to global 
environmental changes. Yet 
international-resource experts 
and national strategies still 
focus largely on goods and 
services delivered by terrestrial 
ecosystems (see go.nature. 
com/2bcqjr0). 

A GMP index would provide 
a measure of marine ecosystem 
goods and services on a national 
or global scale, derived from 
estimates for individual oceans. 
More international research will 
be necessary to underpin these 
estimates. The results would 
inform decision-makers, the 
private sector and the public on 
how they could help to achieve 
goal 14, as well as the 60 targets 
across most of the 17 SDGs that 
are relevant to the sustainable 
development of coastal zones. 
An integrated programme that 
measures, monitors and assesses 
the health of human-ocean 
systems should oversee their 
sustainability. 

Yonglong Lu* Research Center 
for Eco-Environmental Sciences, 
Chinese Academy of Sciences, 
Beijing, China. 

yllu@rcees.ac.cn 

*On behalf of 5 correspondents (see 
go.nature.com/2biddcz for full list). 


Could Pokémon Go 
boost birding? 


In a week when the game 
Pokémon Go topped 15 million 
downloads, I had a salutary 
reminder that urban humans risk 
losing touch with nature — with 
possible negative implications 
for the future of fieldwork in 
conservation and ecology (see 
also Nature 535, 323-324; 2016). 
As I set out to go birdwatching 
in Queensland's rainforest, my 
14-year-old daughter grabbed 
her smartphone to search for rare 
Pokémon in every nearby park, 
beach and town. The Pokémon 
are an extremely speciose 
group that undergo continuous 
evolution and have particular 
ecological needs. Embedded in 
nature by an augmented reality, 
they hold the same naturalistic 
delight for my daughter as a 
cassowary (Casuarius casuarius) 
does for me. 
At the end of my day, I had 
counted three ‘lifers’ (my 
first sightings of Platalea 
regia, Entomyzon cyanotis 
and Nectarinia jugularis) and 
my daughter had spotted 
30 Pokémon. I was delighted 
when she asked me about a bird 
that appeared beside a Pidgey 
on her screen. It was a real 
laughing kookaburra (Dacelo 
novaeguineae). 
Fabio de Oliveira Roque Federal 
University of Mato Grosso do Sul, 
Brazil. 
roque.eco@gmail.com 
Verifying quantum superpositions at metre scales 


ARISING FROM T. Kovachy et al. Nature 528, 530-533 (2015); doi:10.1038/nature16155 


Although the existence of quantum superpositions of massive particles 
over microscopic separations has been established since the found- 
ing of quantum mechanics, the maintenance of superposition states 
over macroscopic separations is a subject of modern experimental 
tests. Kovachy et al.¹ report on applying optical pulses to place a freely 
falling Bose-Einstein condensate into a superposition of two trajec- 
tories that separate by an impressive distance of 54 cm before being 
redirected towards one another. When the trajectories overlap, a final 
optical pulse produces interference with high contrast, but random 
phase, between the two wave packets. Contrary to ref. 1, we argue that 
the observed interference is consistent with, but does not prove, the 
claim that the spatially separated atomic ensembles were in a quantum 
superposition state; therefore, the persistence of such superposition 
states remains experimentally unestablished. There is a Reply to 
this Brief Communication Arising by Kovachy, T. et al. Nature 537, 
http://dx.doi.org/10.1038/nature19109 (2016). 

The authors of ref. 1 equate the observation of interference with the 
existence of a phase-coherent quantum superposition between the sep- 
arated atomic samples. However, Anderson’s hypothetical experiment², 
which involves two independently produced, ‘non-communicating’ 
volumes of superfluid helium, emphasizes a distinction between inter- 
ference and phase coherence. He pointed out that connecting the two 
volumes by a narrow orifice would result in a Josephson current and that 
the relative phase determined from the Josephson relation would have a 
random value. It is impossible, even in principle, for any measurement 
on a single realization of the set-up to determine whether this phase 
was established before or after the Josephson current was produced’. 

Such thought experiments have been realized in the laboratory, 
demonstrating interference between two independently generated 
light beams³ and between two independently produced Bose-Einstein 
condensates⁴. In both these examples, the spatially separated, indistin- 
guishable quantum objects—photons in one case, sodium atoms in the 
other—had no defined quantum coherence between them. Each sample 
could have interacted with its own local environment and experienced 
uncorrelated perturbations therefrom. Yet, in each repetition of the 
experiment high-contrast interference was observed, whereas the phase 
of the interference was irreproducible between repetitions. The same 
behaviour is observed by Kovachy et al. 

Phase-coherent quantum superposition states are characterized by 
first-order coherence. First-order coherence measures the expectation 
value of a product of two field operators, $\langle \hat{\psi}^{\dagger}_{A} \hat{\psi}_{D} \rangle$. In a many-body sys- 
tem, this expectation value appears in the off-diagonal element of the 
one-body reduced density matrix. Such coherence is measured by a 
two-slit interference experiment: the quantum fields emanating from 
two points, A and D, are allowed to interfere. The presence of first-order 
coherence, that is, of quantum superposition states, is indicated by an 
interference pattern with a determinate phase. Although it is possible 
that the random-phase interference observed by Kovachy et al. is 
caused by only technical imperfections in their optical pulses, their 
observation is also consistent with the lack of first-order coherence and 
of coherent quantum superpositions. 

By contrast, second-order coherence measures the expectation value 
of a product of four field operators, $\langle \hat{\psi}^{\dagger}_{A} \hat{\psi}^{\dagger}_{B} \hat{\psi}_{C} \hat{\psi}_{D} \rangle$, and is an element 
within the two-body density matrix. Second-order coherence is 
indicated by the fact that the interference pattern produced by two quan- 
tum fields, for example, those emanating from points A and D, is the 
same as that between two other quantum fields, for example, those 
emanating from points B and C. In the experiment of Kovachy et al., the 
points A and B correspond to locations within the gas on the upper 
interferometer path and the points C and D to positions within the gas 
on the lower path. Their observation that the interference phase in one 
portion of the gas is equal to that in another portion of the gas demon- 
strates the existence of second-order, but not first-order, coherence. 
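
The distinction between the two kinds of coherence can be illustrated with a small classical toy model. This is only a sketch under stated assumptions, not a simulation of the experiment in ref. 1: each shot is given a uniformly random fringe phase, and two portions of the cloud share that phase up to a small local noise term. The function names, the number of shots and the 0.05 rad noise level are all hypothetical choices made for illustration.

```python
import cmath
import math
import random


def fringe(phase, n_points=64):
    """Ideal two-wave fringe, 1 + cos(x + phase), sampled over one period."""
    return [1.0 + math.cos(2.0 * math.pi * i / n_points + phase)
            for i in range(n_points)]


def contrast(pattern):
    """Fringe visibility: (max - min) / (max + min)."""
    hi, lo = max(pattern), min(pattern)
    return (hi - lo) / (hi + lo)


def simulate(n_shots=2000, local_noise=0.05, seed=0):
    """Repeat the toy experiment with a fresh random phase in every shot."""
    rng = random.Random(seed)
    n_points = 64
    mean_pattern = [0.0] * n_points
    shot_contrast = 0.0
    phase_phasor = 0j     # shot average of exp(i * phase)
    relative_phasor = 0j  # shot average of exp(i * (phase_AD - phase_BC))
    for _ in range(n_shots):
        # Assumption: the overall phase is completely random from shot to shot.
        phase = rng.uniform(0.0, 2.0 * math.pi)
        # Assumption: two portions of the gas see the same phase plus small noise.
        phase_ad = phase + rng.gauss(0.0, local_noise)
        phase_bc = phase + rng.gauss(0.0, local_noise)
        pattern = fringe(phase_ad, n_points)
        shot_contrast += contrast(pattern) / n_shots
        mean_pattern = [m + p / n_shots for m, p in zip(mean_pattern, pattern)]
        phase_phasor += cmath.exp(1j * phase_ad) / n_shots
        relative_phasor += cmath.exp(1j * (phase_ad - phase_bc)) / n_shots
    return {
        "single_shot_contrast": shot_contrast,
        "averaged_contrast": contrast(mean_pattern),
        "phase_reproducibility": abs(phase_phasor),
        "relative_phase_reproducibility": abs(relative_phasor),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

In this toy model every single shot shows a fringe of nearly full contrast, yet the fringe averaged over shots is almost flat because the phase is not reproducible. The relative phase between the two portions stays close to the same value in every shot, which is the classical analogue of observing second-order coherence without first-order coherence.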

Therefore, we assert that the experiment of Kovachy et al. does not 
demonstrate the existence of quantum superposition states of mas- 
sive particles over metre length scales. The second-order coherence 
observed in the experiment is immune to perturbations that are com- 
mon across the sub-millimetre length scale of each of the spatially 
separated clouds, but that differ arbitrarily over the metre-scale dis- 
tance between the two paths of their atomic interferometer. Such per- 
turbations can arise from technical imperfections or intrinsic atomic 
interactions. In addition, and directly relevant to the claims made by 
Kovachy et al., these perturbations would arise from exotic effects 
proposed in theories of continuous spontaneous localization or grav- 
itationally induced decoherence®®. Indeed, if these exotic localization 
effects localize particles with metre-scale resolution and also respect 
the indistinguishability of identical quantum particles, then they could 
collapse the states onto an incoherent mixture that has a definite mass 
in each arm and thus determine the exact number of atoms in each 
interferometer path. Such localization would completely eliminate 
first-order coherence between the two interferometer paths, so that 
the one-body density matrix becomes that of a mixed state®. The state 
produced by such localization is a ‘quantum superposition’ only insofar 
as it is composed of identical bosons, the wavefunctions of which 
must be symmetric under particle exchange. Yet, as in Anderson's 
thought experiment’, the two atomic wave packets will still show high- 
contrast interference, with an interference phase that is random 
between experiments”®. We note that our argument contradicts the 
claim by Nimmrichter and Hornberger’ that single-shot measurements 
of atom interferometers serve to test their phenomenological model for 
the decay of macroscopic quantum superposition states. 
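
The statistical point made above, that every shot can show high-contrast fringes while the phase is random between experiments, can be illustrated with a toy numerical sketch. It is a purely classical caricature and not a model of the experiment: the idealized fringe shape `1 + cos(x + phase)`, the uniformly random phase, the number of shots and all function names are assumptions made for illustration.

```python
import math
import random

def fringe(phase, n_points=200):
    """Idealized single-shot density profile 1 + cos(x + phase) over one period."""
    return [1.0 + math.cos(2 * math.pi * i / n_points + phase)
            for i in range(n_points)]

def contrast(profile):
    """Fringe contrast, defined as (max - min) / (max + min)."""
    hi, lo = max(profile), min(profile)
    return (hi - lo) / (hi + lo)

def averaged_profile(n_shots, rng, n_points=200):
    """Average many single-shot profiles, each with an independent random phase."""
    total = [0.0] * n_points
    for _ in range(n_shots):
        shot = fringe(rng.uniform(0.0, 2 * math.pi), n_points)
        total = [t + s for t, s in zip(total, shot)]
    return [t / n_shots for t in total]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
single_shot_contrast = contrast(fringe(rng.uniform(0.0, 2 * math.pi)))
ensemble_contrast = contrast(averaged_profile(2000, rng))
print(single_shot_contrast, ensemble_contrast)
```

In this sketch each individual shot has nearly full contrast, whereas the contrast of the shot-averaged profile is close to zero, which is why single-shot contrast alone does not establish a determinate phase.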

The second-order coherence observed by Kovachy et al. does demon- 
strate that each of the separated Bose-Einstein condensates remains 
coherent over its sub-millimetre size during the 1-s time of propaga- 
tion. Matter-wave coherence over similar timescales and length scales 
has been observed previously, as summarized in extended data figure 3 
and extended data table 1 of ref. 1. In those previous works, the exist- 
ence of a determinate phase is confirmed by comparing the phases of 
two well-separated atomic interferometers, allowing for the elimination 
of common-mode technical noise. The observations of Kovachy et al. 
do rule out exotic effects that would cause the position of each and 
every individual atom to be independently measured with metre-scale 
resolution. However, such effects would violate the principle of the 
indistinguishability of identical particles, and are thus implausible (as 
discussed in ref. 6). 

Verification of a quantum mechanical superposition requires the 
measurement of a determinate phase to distinguish a pure quantum 
state from a statistical mixture of several pure states. Once information 
about the phase is lost, whether owing to measurement noise,
interactions with the environment, or a fundamental source of decoherence,
no further measurement can distinguish between a quantum super- 
position and a mixed state. Without determinate phase information, 
the system in ref. 1 is consistent with being in a statistical mixture of 
interferometer states. 

If the system examined by Kovachy et al. does indeed retain quan- 
tum coherence over long timescales and length scales, then evidence 
for such coherence could be obtained either by better phase stabili- 
zation of the optical pulses or, if that is impractical, by operating two 
well-separated interferometers that share the same optical pulses.
These improvements would enable the impressive technical advance 
in atom interferometry reported in ref. 1 to become a test of quantum 
physics at long length scales. 


D. M. Stamper-Kurn, G. E. Marti & H. Müller

1Department of Physics, University of California, Berkeley, California 
94720, USA. 

2Materials Sciences Division, Lawrence Berkeley National Laboratory, 
Berkeley, California 94720, USA. 

email: dmsk@berkeley.edu 

3JILA, National Institute of Standards and Technology and University of 
REPLYING TO D. M. Stamper-Kurn, G. E. Marti & H. Müller Nature 537, http://dx.doi.org/10.1038/nature19108 (2016)


In the accompanying Comment¹, Stamper-Kurn et al. assert that our
observation of interference contrast in a half-metre-scale atom
interferometer² does not prove the existence of macroscopic quantum
superpositions and, hence, does not test quantum mechanics at long length
scales. Moreover, they imply that intrinsic atomic interactions or tech- 
nical imperfections could prevent the application of our work to future 
differential measurements. In response, we argue: (i) that in standard 
quantum mechanics there is no known mechanism in our system that 
prohibits its use in future differential measurement applications; 
(ii) that our experiment tests quantum mechanics in that it constrains 
any modifications that would reduce contrast in an interferometer with 
arms that propagate over widely separated trajectories; and (iii) that, 
using a standard definition of superposition, our observation of inter- 
ference results from quantum superposition at the half-metre scale. 
In particular, we argue that quantum superposition is a more general 
concept than first-order coherence. 

We operated our atom source with a condensate fraction of 
approximately 50%. The atom source has a coherence length of only 
2x 10~°m, substantially smaller than the spatial extent of the atom 
cloud. This short coherence length arises from imperfections in the 
magnetic lensing, the lattice launch, and the interactions of the atoms 
with the Bragg laser beams. Coherence between the two interferometer 
arms is established by the initial beam-splitter pulse, at which time the 
ratio of the interaction matrix element (U) to the Bragg-transition Rabi 
frequency (J) is U/J~ 1078, which rules out interaction-based effects 
during the beam-splitter pulse³. The atomic density is no larger than
about 10¹⁰ cm⁻³ during the interferometer sequence, which is dilute
enough to prevent dephasing due to mean-field shifts (the mean-field 
shift is approximately 0.1 Hz). Under these conditions, standard quan- 
tum mechanics rules out evolution into the state described in ref. 1. 
Furthermore, we know of no technical noise sources that would lead 
to the emergence of such states; all known technical noise sources,
such as residual spontaneous emission, are associated with momentum 
exchange that modifies the structure of the atomic states and reduces 
contrast. Therefore, there is no known mechanism that would prohibit 
the utilization of the acceleration sensitivity inferred from the large arm 
separation in differential measurement applications, such as using dual 
species interferometry for a test of the equivalence principle⁴.

When evaluating the degree to which our experiment constrains a 
particular hypothetical modification of quantum mechanics, it is impor- 
tant to consider disturbances to the states of individual atoms—for 
example, due to momentum exchange that fundamentally alters the 
structure of the many-atom state as it propagates. In the case of momen- 
tum exchange, the large spatial separation directly translates into an 
increased sensitivity to this spurious heating. A spurious momentum
kick ħq (where ħ is the reduced Planck constant and q is the wavenumber
associated with the momentum kick) that occurs midway through
the interferometer is associated with a wave-packet phase shift of
[m(v + ħq/m)²/2 − mv²/2]T/ħ ≈ qL, where m is the atomic mass,
v is the velocity separation, T is the drift time and L ~ vT is the
wave-packet separation. Even momentum kicks as small as q ~ 2π/L
(corresponding to a wavelength of about L) result in phase shifts of
around 2π, which, if they occur inhomogeneously, result in reduced
contrast. Modifications that add only overall phase noise are not ruled 
out by our results. 
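
The phase-shift estimate can be checked by expanding the kinetic-energy difference. The following is a sketch that uses only the symbols defined above, with L ~ vT as in the text and the added assumption that the recoil term ħq/2m is negligible compared with v:

```latex
\begin{align}
\Delta\phi
  &= \frac{T}{\hbar}\left[\frac{m}{2}\left(v + \frac{\hbar q}{m}\right)^{2} - \frac{m v^{2}}{2}\right]
   = q v T + \frac{\hbar q^{2} T}{2m} \\
  &\approx q L \qquad \text{for } \frac{\hbar q}{2m} \ll v .
\end{align}
```

Under these assumptions, setting q ~ 2π/L gives a phase shift of about 2π, as stated.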

We would like to clarify our use of the word ‘superposition’. In ref. 2,
following Feynman and others, we adopted the nomenclature that 
interference—whether or not there is a determinate phase—necessarily 
results from superposition (see, for example, refs 5 and 6). This view of 
superposition is illustrated by the Pfleegor-Mandel experiment⁷, which
tracks the build-up of an interference pattern from two independent 
laser beams one photon at a time. A standard interpretation of these 
Questioning Holocene community shifts


ARISING FROM S. K. Lyons et al. Nature 529, 80-83 (2016); doi:10.1038/nature16447 


Demonstrating changes in the structure of plant and animal commu- 
nities over geological time scales and linking these changes to human 
impacts in the Holocene epoch would be an important contribution to 
the fields of ecology and conservation biology¹. Lyons et al.² claim to
provide such evidence based on a decrease in the proportion of spatially 
aggregated species pairs, using co-occurrence data from assemblages 
spanning the past 300 million years. However, we suggest that apparent 
flaws in their predictions, assumptions, methods and interpretations 
undermine this claim, and we question the conclusion that the structure 
of communities has fundamentally changed during the Holocene. There 
is a Reply to this Brief Communication Arising by Lyons, S. K. et al. 
Nature 537, http://dx.doi.org/10.1038/nature19111 (2016). 

Lyons et al.² calculated the proportion of aggregated species pairs
over the total number of significantly aggregated and segregated species 
pairs for all assemblages. Their conclusions are based on a linear seg- 
mented regression of the proportion of aggregated pairs against time. 
We believe that a linear Gaussian model cannot account for estimation
errors that follow a binomial distribution. The Gaussian model assumes
that the variance of the proportions is constant. Plotting
the residuals of the linear Gaussian model against the number
of species pairs, we observe that the variance decreases as the number
of species pairs increases (S1). Indeed, in this study², a proportion
of 50% aggregated species pairs could have been calculated from
one aggregated and one segregated species pair, or from 100 aggregated
and 100 segregated pairs. The reliability of the estimate is clearly
not the same. In total, 44% of the proportions are based on five or fewer
species pairs from assemblages with several thousand random species
pairs. Accounting for the number of species pairs using a generalized
linear model (GLM; binomial error distribution, logit link), we found
that the proportion of aggregated species pairs did indeed decrease over
time (GLM, χ²(1,99) = 229, P < 0.0001), yet the segmentation did not
reveal any breakpoint at −6,000 years (Fig. 1a).
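
The difference in reliability between the two 50% estimates mentioned above can be made concrete with a short sketch. It assumes that species pairs behave as independent binomial trials, which is a simplification, and the function name `binomial_se` is ours rather than part of any published analysis.

```python
import math

def binomial_se(aggregated, segregated):
    """Standard error of the aggregated proportion under a simple binomial model."""
    n = aggregated + segregated
    p = aggregated / n
    return math.sqrt(p * (1.0 - p) / n)

se_small = binomial_se(1, 1)      # 50% estimated from 2 significant pairs
se_large = binomial_se(100, 100)  # 50% estimated from 200 significant pairs
ratio = se_small / se_large       # how much less precise the small sample is
print(se_small, se_large, ratio)
```

Under this assumption the standard error scales as one over the square root of the number of pairs, so the two-pair estimate is ten times less precise than the 200-pair estimate, although a Gaussian linear model would weight the two equally.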

Commenting on shifts in community structure in a meaningful 
way involves comparing the comparable—that is, communities of the 
same taxonomic group within the same broad geographical area over 
time. However, Lyons et al.² estimated the proportion of aggregated
species pairs in 101 assemblages over a time span of 300 million years 
with numerous confounding factors such as taxonomic group, number 
of species, temporal extent or spatial grain. For example, community 
structure in 290-million-year-old plant assemblages with data spanning 
16 million years, in which some species in a ‘community’ are unlikely 
to have actually co-existed, was compared with 10-year-old mammal 
assemblages. This is not inherently wrong, but the confounding variables
should be included in the model. Yet Lyons et al.² neither added these
predictors nor tested for potential interactions among them.

Instead, they used univariate correlations to test for an effect of four 
continuous variables on the proportion of aggregated species pairs, 
omitting 39 ‘modern’ assemblages (less than 100 years old) from 
these tests. Without these modern data, their linear model would no 
longer show any breakpoint, and in addition there would be no significant
decline in the proportion of aggregated species pairs over time
(F(1,60) = 1.72, P = 0.19, analysis of variance (ANOVA)).

Working with the same dataset as the authors, excluding modern 
data for which information on several of these potentially confound- 
ing factors was not available, we analysed the effect of time on the 
three identifiable taxa. We grouped the data into three time periods
(ancient, medium and modern) and found a significant interaction:
the effect of time varies across taxa (GLM, interaction taxon × time
period, χ²(3,54) = 32.3, P < 0.0001; Fig. 1b). Given that these taxonomic
groups were not represented over the entire 300 million years con- 
sidered (no medium time data for plants, almost exclusively medium 
time data for pollen), it is reasonable to consider to what extent the 
effect of taxonomic group, and the inclusion of modern data whose
taxonomic groups we do not know, determine the detection of a
breakpoint.

Figure 1 | Models of community structure over time: three types
of possible bias. a, Proportion of aggregated species pairs over time,
modelled using a linear segmented regression showing a breakpoint
(model of Lyons et al.) compared to a GLM with binomial error. LM,
linear model. b, GLM of the proportion of aggregated species pairs by time
period in interaction with the taxonomic group. Widths of the boxplots
are proportional to the number of points per group and the boxplots are
centred on the mean of the time period per group. c, Correlation of time
before present and temporal extent for mammal data.

To test whether the proportion of aggregated species pairs declines 
over time when additional confounding factors are taken into account, 
we focused on mammals because they are the only group with at least 
four data points for each of the time periods (ancient, medium and 
modern). We found that ‘temporal extent’ (the maximum amount of
time encompassed by a dataset) was strongly correlated with ‘time before
present’ (R² = 0.98, F(1,25) = 1,325, P < 0.0001). It is impossible
to determine which of the two variables (temporal extent or time before
present) causes the decrease in the proportion of aggregated species
pairs. Therefore, the data of Lyons et al.² did not provide evidence for a
decreasing proportion of aggregated species pairs over geological time 
scales. We propose as an alternative hypothesis that the confounding 
variable, temporal extent, may act on the proportion of aggregated spe- 
cies pairs in the following manner: by associating several communities 
over an increasingly long time window (up to 16 million years), the 
probability of obtaining positive associations by chance might increase. 
This could be the case if the same environmental features caused suc- 
cessions of different species to have similar geographical distributions 
although they never actually co-occurred. 

The conclusions of Lyons et al.² rest on the assumption that natural
communities of different taxonomic groups, spatial and temporal
extents or geographical locations are comparable across time periods. 
We disagree and believe that none of their findings supports their claim 
that Holocene shifts in the assembly of plant and animal communities 
have occurred and implicate human impacts. We urge greater caution 
in conducting and interpreting community analyses over geologi- 
cal time scales. Robust generalizations about those shifts in driving 
forces of community structure will be possible only after incorpo- 
rating a much higher number of assemblages, especially around the 
critical period encompassing a putative breakpoint. The study design 
should be based on the question that is addressed. If the authors aim 
to identify shifts in community structure due to human impacts, they 
should sample a greater number of assemblages in the Holocene and
Anthropocene³ when critical shifts in manmade environmental
modifications occurred. It is unclear how ancient data (as old as 300 million
years) would help to address this question. Crucially, future studies 
should tease apart impacts of confounding factors potentially influ- 
encing community structure. 


Methods 

Using a GLM (binomial error, logit link), we modelled the proportion of aggregated 
species pairs as a function of time and tested for a breakpoint using the R package 
‘segmented’ (following ref. 2; Fig. 1a). We tested the interaction of time (treated 
as a factor with three groups) and taxonomic groups using a GLM without the 
‘modern’ dataset (Fig. 1b). Using a GLM, we tested for a confounding effect of 
temporal extent and time. We modelled the proportion of aggregated species pairs 
as a function of temporal extent (Fig. 1c) and tested whether temporal extent and 
time were correlated (Pearson's r). 
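
As an illustration of the kind of model described in these Methods, the sketch below fits a binomial GLM with a logit link to grouped counts by Newton's method. It is not the R code used for the analyses: the function `fit_binomial_logit`, the synthetic noise-free data and the coefficient values are all hypothetical, and the breakpoint search performed by ‘segmented’ is not reproduced.

```python
import math

def fit_binomial_logit(x, successes, trials, n_iter=50):
    """Fit logit(p) = b0 + b1 * x to grouped binomial counts by Newton's method."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        # Score (gradient) and information-matrix entries of the log-likelihood.
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, ki, ni in zip(x, successes, trials):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = ni * p * (1.0 - p)  # binomial variance: more pairs, more weight
            r = ki - ni * p         # observed minus expected aggregated pairs
            g0 += r
            g1 += r * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 linear system for the coefficient update.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-12:
            break
    return b0, b1

# Hypothetical assemblages: a time covariate and the number of significant pairs.
time_index = [0.0, 1.0, 2.0, 3.0, 4.0]
n_pairs = [2, 5, 20, 100, 200]
true_b0, true_b1 = 1.0, -0.5
# Noise-free expected counts, so the fit should recover the true coefficients.
aggregated = [n / (1.0 + math.exp(-(true_b0 + true_b1 * t)))
              for t, n in zip(time_index, n_pairs)]

b0_hat, b1_hat = fit_binomial_logit(time_index, aggregated, n_pairs)
print(b0_hat, b1_hat)
```

The weight `w` is what distinguishes this fit from ordinary least squares on the raw proportions: assemblages with many significant pairs constrain the coefficients more strongly than those with only a few.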
Lyons et al. reply

REPLYING TO C. Bertelsmeier & S. Ollier Nature 537, http://dx.doi.org/10.1038/nature19110 (2016)


In the accompanying Comment¹, Bertelsmeier and Ollier criticize our
statistical analyses and question our conclusion that there was a shift
in community structure in the mid-Holocene epoch². The critique is
based on a generalized linear model (GLM) using a binomial error
distribution and logit link. We question the validity of the analyses
made by Bertelsmeier and Ollier¹ for several reasons.

First, they suggest that our data² are better modelled by a GLM
using a binomial distribution than by the normal distribution used
in ordinary least squares (OLS) regression. However, in calculating
model error, the GLM assumes a binomial error distribution for each
counted Bernoulli event (here, species pairs in an assemblage), and
that these events are mutually independent. Species pairs are not
independent of one another, because the same species occurs in multiple
pairs. The error distribution of species pairs across an assemblage
is unknown, but it is not modelled by a simple binomial, in part
because there are three possible outcomes (aggregated, segregated,
or neither). Our linear model (LM) is simpler because it estimates
the variance (error term) using deviations from the regression line
and it fits the data better. By contrast, the binomial GLM makes
strong assumptions about independence and error mean-variance
relationships. Using the same subset of the data, we found that Akaike
information criterion (AIC) values provide much stronger support
for a breakpoint analysis using OLS regression (74.6) than using a
GLM (637.0).

Second, Bertelsmeier and Ollier¹ argue that datasets with only a
few significant pairs should be excluded because those estimates are 
unreliable. They ignore the fact that significant pairs were identified 
using a null model that preserves row and column totals (species 
richness per site, and incidence per species) in the null assemblages. 
Aggregation must appear above and beyond the values expected from 
the number of species collected per sample. Finding limited signifi- 
cant aggregations is not an unreliable estimate, but a robust result of 
the null model. 
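
A null model that preserves row and column totals can be sketched with a generic ‘checkerboard swap’ randomization. This is an illustration of the general technique only; it is not claimed to be the algorithm implemented in Pairs, and the function name, the toy matrix and the choice of rows as species and columns as sites are assumptions.

```python
import random

def swap_randomize(matrix, n_swaps, rng):
    """Shuffle a presence-absence matrix while preserving row and column totals."""
    m = [row[:] for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    for _ in range(n_swaps):
        r1, r2 = rng.sample(range(n_rows), 2)
        c1, c2 = rng.sample(range(n_cols), 2)
        # A 2x2 'checkerboard' (10/01 or 01/10) can be flipped without
        # changing any row sum or column sum.
        if (m[r1][c1] == m[r2][c2] and m[r1][c2] == m[r2][c1]
                and m[r1][c1] != m[r1][c2]):
            m[r1][c1], m[r1][c2] = m[r1][c2], m[r1][c1]
            m[r2][c1], m[r2][c2] = m[r2][c2], m[r2][c1]
    return m

# Hypothetical data: rows are species, columns are sites.
observed = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]
null = swap_randomize(observed, 1000, random.Random(1))

row_totals_kept = [sum(r) for r in null] == [sum(r) for r in observed]
col_totals_kept = [sum(c) for c in zip(*null)] == [sum(c) for c in zip(*observed)]
print(row_totals_kept, col_totals_kept)
```

Because incidence per species and richness per site are held fixed in every null matrix, a pair counts as significantly aggregated only if it co-occurs more often than these totals alone would produce.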

Third, in re-testing for a Holocene shift, Bertelsmeier and Ollier¹
exclude most modern datasets, subset taxonomic groups, and break
the data into three arbitrary time bins that bear no correspondence
to accepted geological time periods³. Their ‘ancient’ bin spans 300 million
years ago to 1 million years ago, which encompasses two of the
largest recorded mass extinctions, the break-up of Pangaea, considerable
changes in climate, the rise of angiosperms and a change from
dinosaur- to mammal-dominated terrestrial ecosystems. Moreover,
their definitions merge all Pleistocene and non-modern Holocene data
into a single bin, obscuring a Holocene shift.

Using a two-way analysis of variance (ANOVA), they conclude that 
“the effect of time varies across taxa” and maintain that well-character- 
ized plant and pollen assemblages should have been excluded because 
of their arbitrary time bins. Frankly, we are more impressed with simi- 
larities among the taxonomic subsets. All three subsets show a decrease 
in aggregations towards the modern; none shows a flat or increasing 
relationship, and pollen shows a mid-Holocene shift consistent with 
our previous conclusions².

Fourth, they propose that long temporal extents in older data- 
sets increase aggregated pairs. We previously demonstrated that 
temporal extent cannot explain these results (see figure 2 in ref. 2). 
We further test this here by collapsing 32 time series into 8
time-averaged assemblages (North American mammals (3 datasets), Kenyan
mammals (2 datasets), South African mammals (2 datasets), North
American pollen: 0–7,000 years (8 datasets), 8,000–14,000 years
(7 datasets), 15,000–20,000 years (6 datasets) and 65 million years ago
(2 datasets), and Palaeocene–Eocene thermal maximum (PETM) plants
(2 datasets)) with increased temporal extents and rerunning Pairs⁴. Of
these, 22 showed a decrease in significant aggregations, 9 showed an 
increase and 1 showed no change. Time-averaging does not necessarily 
increase aggregations. 

In summary, the critique of Bertelsmeier and Ollier¹ depends on (1)
the use of a GLM for non-independent proportional data that do not
follow a binomial distribution; (2) a fundamental misunderstanding
of the fossil record and geological time; and (3) selective use of our
data. Bertelsmeier and Ollier¹ regard our use of fossil datasets representing
different taxonomic groups, spatial and temporal extents, and
geographical locations as inappropriate for demonstrating a Holocene
shift in community structure. We regard our diverse datasets as good
support for the strength of the signal we detected, which emerged in
spite of this diversity. Contrary to the assertion by Bertelsmeier and
Ollier¹, finding a recent shift in an ancient pattern that has been stable
for 300 million years is entirely relevant to disentangling the effects of
humans and natural causes on community structure. We stand by our
original analyses and conclusions.

The author order of this Reply reflects the relative contributions of 
the authors, and is different from that in ref. 2. Author D.W. did not 
participate in this response. 
ALZHEIMER’S DISEASE


Attack on amyloid-β protein


An antibody therapy markedly reduces aggregates of amyloid-β, the hallmark protein of Alzheimer’s disease, and might
slow cognitive decline in patients. Confirmation of a cognitive benefit would be a game-changer. SEE ARTICLE P.50 


ERIC M. REIMAN 


protein was proposed as the trigger for a
cascade of events in the brain that lead to
Alzheimer’s disease. A growing number of
anti-Aβ treatments have been developed to
short-circuit this cascade, and several are
currently being evaluated in people who have
already developed or are at risk of developing
symptoms of Alzheimer’s (www.alzforum.org/therapeutics).
On page 50, Sevigny et al. report
findings from an initial 12-month, placebo-controlled
trial of the antibody aducanumab,
which selectively binds to potentially harmful
soluble and insoluble Aβ aggregates, respectively
called Aβ oligomers and fibrils.

The trial was primarily intended to clarify
the Aβ-fibril-reducing effects and safety of
different aducanumab doses administered
intravenously once a month. It involved people
who had been diagnosed with mild cognitive
impairment (non-disabling memory and
thinking problems) or mild dementia (which
did have a slightly disabling effect) due to
Alzheimer’s disease. Each of the participants
also tested positive for Aβ in a positron emission
tomography (PET) scan, indicating moderate
to frequent build-up of fibril-containing
plaques, a cardinal feature of the disease. The
study was not designed to definitively address
aducanumab’s effect on cognitive decline.

Aducanumab treatment was associated
with unusually striking, progressive, dose-dependent
reductions in PET measurements
of Aβ-plaque burden. Aducanumab was also
presumed to bind to and remove harder-to-measure
Aβ oligomers, which seem to accumulate
at or near plaques and may be the
more damaging of the two aggregates. What's
more, despite the relatively small number of
study participants and the substantial extent to
which the disease has progressed by the time
people with Alzheimer’s develop memory and
thinking problems, exploratory analyses suggested
that higher antibody doses and greater
Aβ-plaque reductions were associated with
slower cognitive decline. If these preliminary
cognitive findings are confirmed in larger
and more-definitive clinical trials, which
are now under way, it would provide a shot
in the arm in the fight against Alzheimer’s
disease and compelling support for the
amyloid hypothesis.

The amyloid hypothesis contends that a
42-amino-acid form of Aβ (Aβ₄₂) becomes
harmful when, owing to its overproduction or
reduced clearance from the brain, individual
Aβ₄₂ monomers come together in various
numbers and conformations to form oligomers
and fibrils. These Aβ₄₂ aggregates trigger
a cascade of neurobiological events, including:
certain inflammatory responses; aggregation,
phosphorylation and propagation of a protein
called tau; and other neuronal changes.
These events contribute to the formation of
Aβ plaques and tau-containing tangles, loss of
neurons and the synaptic connections between
them, cognitive decline and disability, and
other features of Alzheimer’s disease (Fig. 1).

Proponents of the amyloid hypothesis cite
an abundance of supporting evidence. Others
note that the evidence is largely circumstantial,
and that questions remain about the offending
Aβ species and its effects. As such, they wonder
whether Aβ₄₂ accumulation is a consequence
rather than an initiator of disease, and worry
that anti-Aβ drug development might lead to a
dead end. What will it take to confirm or refute
the amyloid hypothesis once and for all?
Confirmation of this hypothesis will require
definitive evidence that an anti-Aβ treatment
can reduce cognitive decline in people affected
by or at risk of developing Alzheimer’s disease.
Sevigny and colleagues’ trial provides convincing
evidence that aducanumab can enter
the brain, target Aβ fibrils and substantially
reverse plaque deposition, a major advance.
But although the authors’ additional cognitive
findings are encouraging, they are not definitive.
It would be prudent to withhold judgement
about aducanumab’s cognitive benefit
until results from the larger trials are in. It will
also be useful to see what can be learnt from
large trials of other anti-Aβ treatments in the
coming months and years.

Figure 1 | The amyloid hypothesis. This hypothesis contends that increases
in the amyloid-β 42 (Aβ₄₂) protein trigger a cascade of events in the brain
that lead to Alzheimer’s disease. Under this hypothesis, individual Aβ₄₂
monomers aggregate into damaging oligomers and fibrils in or near Aβ₄₂
plaques. Aβ aggregates cause certain inflammatory responses. Through
unknown mechanisms, these events lead to the aggregation, phosphorylation
and propagation of tau, a protein that is associated with microtubules (pink
and purple) and is the main constituent of harmful tangles. Affected neurons
and synapses become dysfunctional and can die, leading to additional
inflammatory responses. The progressive dysfunction, degeneration and loss
of affected neurons and synapses is associated with cognitive decline, other
symptoms of Alzheimer’s disease and increasing disability. Sevigny et al.
report that the Aβ-binding antibody aducanumab binds to, promotes removal
and blocks accumulation of fibrils and oligomers.

Conversely, refutation of the amyloid
hypothesis will require failure of anti-Aβ
treatments to reduce cognitive decline in sufficiently
large and suitably designed trials: not
only in people with cognitive impairment due
to Alzheimer’s disease, but also in people without
such impairment who have evidence of
Aβ plaques, and even people without impairment
who are at genetic risk of developing
Alzheimer’s but have little or no Aβ deposition.
Several prevention trials using anti-Aβ
treatments have started, and more are on
the way. Because abnormal Aβ build-up can
begin more than two decades before the onset
of memory and thinking problems, having a
drug such as aducanumab that substantially
reverses pre-existing Aβ deposition might
increase the chances of extinguishing the
disease even after it has set in.

What accounts for aducanumab’s unusually
pronounced plaque-busting effects, even in
small doses and despite the fact that only one
to two antibody molecules out of every thousand
are thought to cross the blood-brain
barrier? It might be a combination of three
things: the drug’s unusually high selectivity for
Aβ₄₂ fibrils and oligomers, which minimizes
the number of antibody molecules that bind to
the abundant Aβ monomers in the blood and
so maximizes the number of unbound antibodies
that can enter the brain; its unusually
high affinity for Aβ₄₂ fibrils and oligomers; and
the mechanism by which it enlists microglia,
the brain’s principal immune cells, to engulf
and clear Aβ fibrils.

On the one hand, aducanumab’s microglia-mediated
activity could account for the antibody’s
ability to remove plaques, rather than
just to slow further Aβ accumulation (which
would be valuable in its own right). On the
other hand, this activity might increase the
chance of people developing amyloid-related
imaging abnormalities (ARIA), defects
characterized by evidence of brain-fluid
accumulation in magnetic resonance imaging
scans. As with certain other anti-Aβ antibody
treatments, Sevigny and colleagues
found that aducanumab was more
likely to cause ARIA in higher doses and
in people who carry the APOE type 4 gene,
which is the major genetic risk factor for
Alzheimer’s disease.

The authors observed that ARIA were some- 
times associated with transient headaches, vis- 
ual disturbances or confusion, but were often 
associated with no symptoms, and that symp- 
toms typically resolved within one to three 
months. Nonetheless, the frequency of ARIA 
caused the researchers to limit the maximum 
dose studied. It will be important to establish 
a sweet spot: a dose that is sufficiently safe and 
well tolerated, but also effective. 

In addition to confirming the amyloid
hypothesis, finding that the effects of
treatments such as aducanumab on Aβ or
other biological measurements of Alzheimer’s
disease are associated with a cognitive benefit
might help to accelerate the evaluation and
regulatory approval of promising Alzheimer’s-prevention
therapies that are based on reducing
the biological measurements alone.
Indeed, confirmation that an anti-Aβ treatment
slows cognitive decline would be a
game-changer for how we understand, treat
and prevent Alzheimer’s disease. Now is the
time to find out.
Cometary dust under
the microscope 


The Rosetta spacecraft made history by successfully orbiting a comet. Data from 
the craft now reveal the structure of the comet’s dust particles, shedding light on 
the processes that form planetary systems. SEE LETTER P.73 


LUDMILLA KOLOKOLOVA 


Planetary systems such as the Solar
System were built from dust in protoplanetary
nebulae, the clouds of gas
and dust in which stars and planets are born. 
These dust particles collided, stuck together 
and eventually formed planetesimals, the 
building blocks of planets. Comets are leftover 
planetesimals, made of ice and dust, and range 
from hundreds of metres to tens of kilometres 
in diameter. They spend most of their lives on 
the outskirts of the Solar System — away from 
damaging radiation and high temperatures, 
and avoiding collisions with other objects — 
thus preserving the material that originally 
formed the protoplanetary nebula. By study- 
ing comets, we can learn about the processes 
that gave rise to the Solar System, even though 
those processes happened almost five billion 
years ago¹. On page 73, Bentley et al.² show
that cometary dust particles are formed from 
a hierarchical assembly of smaller constitu- 
ents, a discovery that has implications for our 
understanding of the formation and evolution 
of planetary systems. 

Because we cannot catch a comet and study 
it in the laboratory, previous analyses have 
inferred the properties of cometary dust par- 
ticles from their interactions with sunlight. 
One of the earliest such analyses* indicated 
that these particles are not solid, compact 
objects, but loosely packed aggregates of tiny 
(sub-micrometre diameter) grains. An aggregate
structure was also found in interplanetary
dust particles (IDPs) collected in Earth’s upper
atmosphere. Many of these IDPs were found to 
have originated from comets’. 

More evidence for the aggregate nature of 
cometary dust came from the Stardust space- 
craft, which collected dust particles during its 
close fly-by° of the comet Wild 2. However, 
neither the IDPs nor the Stardust samples 
were unmodified (pristine). The IDPs would 
have been affected by their long exposure to 
solar radiation, any collisions with other dust 
particles and interactions with Earth’s atmos- 
phere. In the case of the Stardust samples, the 
spacecraft collected dust particles at a distance 
of hundreds and even thousands of kilometres 
from the comet’s surface — and as the parti- 
cles travelled between the two, their properties 
would have changed as a result of evaporation 
of volatile components, possible destruction of 
complex organic compounds and fragmenta- 
tion of the particles themselves. 

The IDPs and Stardust samples were also 
damaged, or even completely shattered, 
during collection. In the case of the Stardust 
samples, because the dust particles were 
travelling at a speed of 6.1 kilometres per sec- 
ond relative to the spacecraft, the particles were 
damaged by the impact with collecting cells in 
the Stardust sample collector. As a result, the 
aggregate structure of the particles was not 
measured directly. Instead, the structure was 
inferred from the complex shape of the tracks 
that the particles produced while crossing the 
aerogel — a low-density material — in the 
cells, or from the impact craters they left on
the aluminium foil that separated the aerogel
cells®. These limitations in data collection
and analysis left the cometary dust particles’
pristine structure undetermined.

Figure 1 | Dust from comet 67P/Churyumov-Gerasimenko. a, A dust particle collected by the Cometary Secondary Ion Mass Analyser (COSIMA)’. The
image shows that cometary dust particles are aggregates of grains with diameters larger than a few micrometres. b, Bentley et al.’ use an atomic force microscope
to measure the size, shape and texture of cometary dust with greater resolution than COSIMA. The 3D image shows an example of one of the dust particles
analysed. The authors’ results indicate that the COSIMA grains are built from even smaller, sub-micrometre grains. When combined with the data from
COSIMA, Bentley and colleagues’ findings show that cometary dust particles are created by the hierarchical assembly of smaller constituents.

A unique opportunity to study pristine 
dust particles arose when the Rosetta space- 
craft came within tens of kilometres of comet 
67P/Churyumov-Gerasimenko and obtained 
samples of cometary dust from this distance. 
These dust particles were first analysed using 
an instrument called the Cometary Secondary 
Ion Mass Analyser (COSIMA), which collected 
and imaged aggregate particles hundreds of 
micrometres in diameter (Fig. 1a). The images 
from COSIMA revealed that the particles were 
aggregates of grains with diameters larger than 
a few micrometres’, contradicting the earlier 
studies of IDPs and Stardust samples that 
indicated sub-micrometre grains. 

Bentley and colleagues resolve this con- 
tradiction using data from Rosetta’s Micro- 
Imaging Dust Analysis System (MIDAS). 
MIDAS is an atomic force microscope — it 
scans the collected dust particles using a 
sharp, needle-like tip, which provides a 3D 
image of the particles with a maximum resolution
of about 4 nm (Fig. 1b). The images from
MIDAS show that the grains seen by COSIMA 
are built from even smaller, sub-micrometre 
grains. The authors’ discovery not only proves 
that the basic building blocks of cometary dust 
particles are sub-micrometre grains, but also 
reveals the hierarchical nature of dust particles. 

This hierarchical structure, although not 
detected by remote-sensing studies or analyses 
of the Stardust samples, had been hypothesized 
by some researchers. For example, models of 
the upper layers of cometary surfaces pro- 
vided the most realistic results when these 
layers were assumed to consist of hierarchically 
structured dust particles*. Another analysis’ 
found that hierarchical growth is necessary to 
reproduce the dust-size distribution that pro- 
vides the best fit to characteristics of observed 
protoplanetary nebulae. 

In addition to showing that cometary dust
particles have a hierarchical structure, Bentley
et al. find that the basic building blocks of the 
particles are roughly spheroidal — the shape 
of a deformed (elongated or flattened) sphere. 
By approximating the grains as spheroids, the 
authors find that their large axis is, on average, 
2.87 times longer than the small axis. These 
findings are strikingly similar to a model pro- 
posed by astrophysicist Mayo Greenberg” in 
the 1980s. In Greenberg’s model, cometary 
dust particles are aggregates of interstellar dust 
particles, which are described as spheroids 
with an axis ratio of 3 to 1. 

The authors’ results enhance our fundamen- 
tal understanding of cometary dust, and the 
processes that ultimately gave rise to planetary 
systems such as the Solar System. Their discovery
of a hierarchical structure in cometary
dust particles and their description of the 
basic building blocks of such particles might 
lead physicists to reconsider the interpretation 
of data obtained from ground-based observations
of comets and re-evaluate the processes
in protoplanetary nebulae, and will
probably give rise to new models of how planets
were formed.


STRUCTURAL BIOLOGY


Moulding the ribosome


Production of the cell’s translational apparatus, the ribosome, requires the 
orchestrated function of hundreds of proteins. A structure of its earliest 
precursor yields unprecedented insight into ribosome formation. 


MARLENE OEFFINGER 


In every living cell, a large macromolecular
complex called the ribosome is responsible
for translating messenger RNA into
amino-acid chains in the cytoplasm. A mature
ribosome contains about 80 ribosomal proteins
(r-proteins) and four ribosomal RNAs
(rRNAs). Yet the construction of a ribosome
is mediated by many more proteins and RNA
molecules within large dynamic pre-riboso- 
mal complexes. Writing in Cell, Kornprobst et 
al.' report that they have exploited advances 
in cryo-electron microscopy to resolve the 
structure of the earliest pre-ribosome, the 90S, 
to a near-atomic resolution of between 4 and 
7 angstroms. The structure reveals, for the first 
time and in stunning detail, the arrangement of
and interactions between many proteins that
have been implicated in ribosome assembly, 
shedding light on a crucial step in early 
ribosome formation. 

In 1967, it was discovered’ that, in eukary- 
otic organisms (those whose cells carry a 
nucleus), a long RNA transcript called the 
pre-rRNA undergoes processing in a nuclear 
compartment, the nucleolus, to produce three 
of the four rRNAs found in the mature ribo- 
some. An analysis’ later that year of ribosomes 
isolated from human nuclei, and a compari- 
son’ of cytoplasmic and nuclear ribosomes in 
1972, revealed that nuclear ribosomes contain 
many more proteins than do their cytoplas- 
mic counterparts. These extra proteins were 
hypothesized to help process the pre-rRNA. 

Since then, the steps of pre-rRNA processing 
have been established and most of the extra 
proteins (now called ribosome biogenesis fac- 
tors) have been identified, thanks to advances 
in biochemistry and mass spectrometry. Dur- 
ing its transcription, the long pre-rRNA is 
assembled with r-proteins, ribosome biogen- 
esis factors and small nucleolar RNAs to form 
a large 90S pre-ribosome. Following the first
stage of pre-rRNA processing, the complex 
splits into two pre-ribosomes, dubbed pre-40S 
and pre-60S, which are eventually exported to
the cytoplasm where they undergo further 
maturation steps and then join as 40S and 
60S subunits to form the mature ribosome. 

Along with the identities of the biogenesis 
factors came the realization that they num- 
bered a vast 200 to 300 in eukaryotes®”. In the 
yeast Saccharomyces cerevisiae, the 90S pre- 
ribosome alone contains about 70 ribosome 
biogenesis factors — almost as many as the 
number of proteins in a mature ribosome’. 
Hence, a recurring question in the field is: why 
does ribosome production require so many 
accessory proteins? 

By resolving the structure of the 90S pre-ribosome
in the fungus Chaetomium thermophilum,
Kornprobst et al. provide an answer to
this question. The authors identified features 
in their structure by fitting data from previous 
biochemical and genetic studies (including 
X-ray structures of several proteins, predicted 
protein-domain structures and known pro- 
tein-protein and protein-pre-rRNA inter- 
actions) to determine where different proteins 
and RNAs are located in the 90S complex. 
The requirement for so many extra proteins 
is explained by the authors’ observation that 
many accessory proteins are arranged around 
the folded pre-rRNA molecule in previously 
defined® multi-protein complexes called 
UTP-A, UTP-B and UTP-C. Of these, UTP-A 
and UTP-B form a scaffold, within which the 
newly transcribed pre-rRNA is encased and 
so can be securely processed, modified and 
assembled with r-proteins (Fig. 1). 

The role of this scaffold is reminiscent of
the way in which chaperone proteins aid folding
of other proteins, a common process
that prevents aggregation of proteins into
non-functional structures. But although
chaperone-mediated protein folding has
been long established’, the idea of chaperone
moulds is new to RNA biology.

Figure 1 | Maturation of the 90S pre-ribosome. Ribosome assembly involves processing of a long RNA
transcript called the pre-rRNA by a complex known as the 90S pre-ribosome. Kornprobst et al.’ resolved
the structure of the 90S at near-atomic resolution. a, A pre-rRNA is generated. b, The authors show that
the pre-rRNA is threaded into a mould formed by the protein complexes UTP-A and UTP-B, and the
RNA-protein complex U3. Encased within this mould, the pre-rRNA is safely folded and processed, with
a sequence called the 5’ external transcribed spacer (5’-ETS) being cleaved away (hidden from view).
c, After processing, pre-40S and pre-60S complexes, which will go on to form the ribosome, separate from
the mould components of the 90S. d, The UTP and U3 complexes are presumably recycled for use with
the next pre-rRNA, whereas the excised 5’-ETS is degraded.

The 90S chaperone mould also includes 
the small nucleolar ribonucleoprotein com- 
plex U3 — an RNA-protein complex that has 
known roles in pre-rRNA processing and fold- 
ing’’"’. Kornprobst et al. showed that one half 
of U3 spans the outer body of the 90S com- 
plex in a scaffold-like arrangement, whereas 
the other half is buried deep within the 90S, 
presumably interacting with the pre-rRNA. 
This part of U3 is associated with a region at 
the end of the pre-rRNA called the 5’ external 
transcribed spacer (5’-ETS), and the authors 
demonstrated that cleavage of this spacer from 
the pre-rRNA is crucial for the separation of 
the processed 90S pre-rRNAs into pre-40S 
and pre-60S complexes, and the progression 
of ribosome production. 

Kornprobst and colleagues also identi- 
fied the position of the pre-18S rRNA (which 
will become the rRNA component of the 40S 
subunit) in their structure. When comparing 
the pre-18S structure with that of the mature 
18S rRNA, the authors observed that the mol- 
ecule underwent progressive folding, begin- 
ning in the domains closest to the site where 
transcription began. In the 90S, these regions
were folded to resemble the mature 18S, 
whereas domains farther from the transcrip- 
tional start site were seemingly still in transitory 
states. This observation fits well with a previous 
model’ of hierarchical rRNA assembly. 

Kornprobst and colleagues have visualized
in detail what, until now, has been seen
through electron microscopy only as small
black balls on strings of pre-rRNA. Holding a 
magnifying glass to the early steps of ribosome 
biogenesis, the authors have finally revealed a 
role for the multitude of ribosome biogenesis 
factors as a chaperone mould that provides a 
secure environment for the processing and 
folding of pre-rRNA. 

The 90S pre-ribosome contains the 
entire rRNA precursor, which includes 
several transcribed spacer sequences that will 
be cleaved away, and sequences that 
will give rise to the rRNAs of the 60S ribo- 
somal subunit. However, Kornprobst et al. 
focused on only the rRNA region and 
the proteins that give rise to the 40S sub- 
unit. As such, many questions about 60S 
formation remain unanswered — for instance, 
whether a separate chaperone-like mould 
encases these other regions of the pre-rRNA. 

There are several structures visible in 90S 
that have not yet been identified. In years to 
come, it will be interesting to index these fea- 
tures and further unravel the role of the UTP-C 
complex and other proteins in 90S pre-rRNA 
maturation. Using the technical advances high- 
lighted in the current study, we can hope to shed 
more light on the dynamic and multi-tiered 
process that is ribosome formation.


Southern Ocean
freshened by sea ice 


The Southern Ocean has become less salty during the past few decades. An 
analysis of sea-ice transport in the ocean suggests that this phenomenon can be 
explained by coupled changes in sea-ice drift and thickness. SEE LETTER P.89 


TED MAKSYM 


The vast band of water that encircles
the Antarctic continent, known as the
Southern Ocean, is the world’s dominant
ocean sink for heat and carbon dioxide’.
It also has a crucial role in the global overturn- 
ing circulation — the sinking, at high latitudes, 
of cold, dense surface waters to the deep ocean, 
and the compensatory rising of deep waters 
originating from lower latitudes. The Southern 
Ocean's salinity has fallen during the past half- 
century, in the surface and intermediate waters 
of the open ocean” and coastal regions’, and 
in deeper waters’. This freshening of surface 
waters has increased stratification (the vertical 
gradient of water density), potentially inhib- 
iting upwelling of deeper water and affecting 
CO₂ uptake®. On page 89, Haumann et al.’
show that the freshening can be explained by
changes in Antarctic sea-ice production and 
transport (Fig. 1). 

Previous explanations for this freshening 
have included a net increase in the difference 
between the amount of precipitation and the 
amount of evaporation over the ocean’, and 
increased input of glacial meltwater*’. How- 
ever, the former is inadequate for explaining 
the freshening in surface and intermediate 
waters of the open ocean, and the latter over- 
estimates the freshening of deep waters. The 
constant movement of sea ice redistrib- 
utes a substantial amount of fresh water’, so 
Haumann and colleagues chose to investigate 
the potential contribution of sea-ice transport 
to this observed freshening. 

When sea ice forms, most of the salt is lost to 
the upper ocean, so the ocean loses fresh water
and its salinity increases. When the ice melts,
the fresh water is returned to the ocean. The 
net impact on the upper ocean would be mini- 
mal were it not for prevailing winds that tend 
to push the ice from coastal waters, where most 
of it is formed, to the north, where it melts. 
This drives a net transport of fresh water that 
contributes to the overturning circulation of 
the Southern Ocean. The saltier water that 
results from ice formation in coastal regions 
contributes to the generation of Antarctic Bot- 
tom Water, and the fresh meltwater input to 
the north mixes with upwelling deep water to 
modify the upper waters of the open Southern 
Ocean’. 

Determining any trends in this sea-ice- 
driven freshwater transport is challenging, in 
part because of a lack of reliable data. The vol- 
ume of sea ice transported can be calculated as 
the product of ice concentration (the fractional 
area of the ocean covered by ice), ice thickness 
and ice drift rate. However, satellite-derived 
ice-drift rates have significant biases relative 
to those measured by drifting buoys, and 
potential biases due to changes in data sources 
and satellite sensors over time. And there is no 
long-term data set for ice thickness. 
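
The product described above can be written out as a short sketch. It assumes the transport is evaluated across a row of equal-width grid cells along some boundary; the function names, the grid layout and every number are illustrative placeholders, not values or methods taken from Haumann et al.

```python
def ice_volume_flux(concentration, thickness_m, drift_m_per_s):
    """Ice volume crossing one metre of boundary per second (m^2/s).

    concentration is the fractional ice cover (0 to 1), as defined in the text.
    """
    if not 0.0 <= concentration <= 1.0:
        raise ValueError("concentration must be a fraction between 0 and 1")
    return concentration * thickness_m * drift_m_per_s


def total_transport(cells, cell_width_m):
    """Sum the flux over a row of grid cells of equal width (m^3/s)."""
    return sum(
        ice_volume_flux(c, h, v) * cell_width_m
        for c, h, v in cells
    )


# Hypothetical (concentration, thickness in m, northward drift in m/s) values.
cells = [(0.9, 1.0, 0.05), (0.5, 0.8, 0.10), (0.0, 0.0, 0.20)]
transport = total_transport(cells, cell_width_m=25_000.0)
print(f"{transport:.0f} m^3/s")
```

Because the three factors multiply, a bias in any one of them (drift in particular, as noted above) carries through proportionally to the transport estimate.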

Haumann et al. addressed these challenges 
by carefully reconstructing time series for each 
of these variables for the period from 1982 
to 2008. First, they established a consistent 
satellite ice-drift time series by removing
inconsistent data associated with the
transitions between satellites, and stitched
together different time periods by correcting
for estimated biases. They then scaled the satellite
ice-drift series to make it consistent with
observed buoy drift.

Figure 1 | Sea ice in the Southern Ocean. Haumann et al.’ report that changes in Antarctic sea-ice drift have altered the salinity of the Southern Ocean.

To reconstruct a time series of ice thickness, 
the authors turned to a model-based estimate 
of ice-thickness trends constrained by observa- 
tions of ice concentration. They then adjusted 
for potential biases in the modelled thicknesses 
using both sparse in situ data’® and ice-thick- 
ness estimates from satellite data’. The time 
series for ice drift and thickness allowed Haumann
et al. to make more-robust estimates of
freshwater transport than were previously pos- 
sible. This, in turn, allowed them to estimate 
the impact of transport trends on the salinity 
of the Southern Ocean using a simple model of 
water-mass exchange between the surface and 
the deeper waters. 

The researchers show that the net transport 
of sea-ice-driven fresh water is substantial: 
larger than the inputs from glacial melt and 
comparable to the net input of precipita- 
tion and evaporation’. The estimated tem- 
poral trends are also sizeable: there is a 20% 
increase in transport over the 26-year study 
period. Notably, however, there is consid- 
erable regional variability in freshwater 
transport trends, including a large increase 
in the Pacific sector of the Southern Ocean 
(which encompasses the Ross Sea, where posi- 
tive trends in northward ice drift and extent 
are largest’*). Transport has decreased slightly 
elsewhere. Overall, Haumann et al. estimate 
that sea-ice-driven transport has contributed 
enough fresh water to the open-ocean sur- 
face and intermediate waters to explain the 
observed freshening. 

A compelling result is that the calculated 
trends in sea-ice-driven freshwater transport 
are consistent with other observed patterns 
of change. First, the increases in freshwater 
transport occur in the Pacific sector, where 
increased freshening in surface waters has 
been strongest’. Second, the increase in 
salt input due to sea-ice production in the 
coastal Pacific sector might explain why the 
observed freshening of Antarctic Bottom 
Water is less than that predicted from increased 
glacial melt”. 

It is striking that major changes to ocean 
properties can occur as a result of relatively 
small average changes in sea-ice cover. Sea-ice 
extent has increased only slightly overall dur- 
ing the period covered by the time series, albeit 
with strongly contrasting regional patterns of 
change’’. These regional changes were partly 
wind-driven”, but, as Haumann et al. show, 
there may be little to no trend in the mean drift 
speed of sea ice. This demonstrates that it is 
the coupled trends in regional ice thickness 
and ice drift that are key to driving freshwater 
redistribution. 

An important caveat to the findings is 
that the uncertainty in the derived trends is
considerable, and potentially underestimated.
The corrections for bias in ice drift are large, 
and are difficult to quantify for the earlier 
years, for which there are almost no inde- 
pendent data available to provide validation. 
Nevertheless, the authors’ estimates of fresh- 
water transport remain similar when they 
are based on ice drift estimated from surface 
winds, which are a reasonable proxy for drift. 
The need for better ice-thickness estimates is 
also clear; ice thickness is the largest source of 
uncertainty in the results, and ice-thickness 
trends are the least well constrained by obser- 
vations. However, a recent complementary 
study’ that used a broader array of observa- 
tions collected between 2005 and 2010 to 
constrain a coupled ice-ocean model broadly 
supports the regional patterns of sea-ice- 
driven freshwater transport estimated in the 
current study, allaying concerns about the 
uncertainties. 

Haumann and colleagues’ findings empha- 
size that Antarctic sea ice is not merely a passive 
indicator of climate change and variability, 
but also a driver of changes in the climate sys- 
tem. Through its potential influence on ocean 
stratification and CO₂ uptake, sea ice might
have a bigger role than previously thought. 

The implications of these results for the 
Southern Ocean in a warming world are uncer- 
tain, because climate models do not properly 
capture the observed changes in Antarctic 
sea ice’*. However, anticipated future declines
in ice extent and volume would suggest that 
sea-ice freshwater transport should decrease. If 
so, then future losses of sea ice can be expected 
to play a prominent part in changes in the 
Southern Ocean’s overturning circulation.


50 Years Ago


Savage found that pondweeds in 
the presence of light stimulate 
spawning in Xenopus laevis ... 
This finding prompts me to report 
my own experience with this 
amphibian under more natural 
conditions ... At the Provincial 
Fisheries Institute, Lydenberg, 
fishponds ... are filled with water 
and fertilized with fowl manure 

in spring for the breeding of fish. 
Within 2 or 3 days after fertilization 
such ponds usually contain large 
numbers of Xenopus, which 
immediately start spawning, so 
that by the time plankton has 
developed the pond is teeming with 
larvae ... that they are attracted by 
fertilized water and spawn before 
an algal bloom develops suggests 
that the primary stimulus for 
spawning ... could be the fertilizer. 
From Nature 3 September 1966 


100 Years Ago 


Mr. Beebe has had a wide experience 
of jungle-life in many lands, and 
hence his latest experiences in 
Brazil have the greater value ... 
Abundance of species and a relative 
fewness of individuals, he remarks, 
are pronounced characteristics of 
any tropical fauna ... He quickly 
discovered that more was to be 
obtained by watching particular 
trees ... [D]uring the space of a
week of intermittent watching he 
obtained no fewer than seventy-six 
new species ... Just before leaving a 
brilliant idea struck Mr. Beebe ... he 
suddenly bethought him to fill a bag 
with four square feet of jungle earth, 
and this was examined ... while on 
board ship on the voyage home... 
Among the captures thus made were 
representatives of two genera of ants 
new to science. There can be no 
doubt that important discoveries ... 
would accrue if this example of Mr 
Beebe's were generally followed in 
the future. 

From Nature 31 August 1916


More is less


Plants compete for the same resources, such 
as nutrients, light and water. Because these 
resources are often limited, the coexistence 
of plant species requires the creation of 
trade-offs in resource use. In this issue, 
Harpole et al. report that increasing a limited 
nutrient in grassland can eliminate these 
potential trade-offs, reducing overall species 
diversity (W. S. Harpole et al. Nature 537,
93-96; 2016). 

The authors considered 45 grassland 
sites across 6 continents, and measured 
species diversity in response to various 
nutrient additions. Their results provide 
strong evidence for a broad ecological 
theory — that the availability of multiple 
limiting resources allows plants with different
limiting requirements to coexist.

The greater the number of limiting resources
that were added, the more species were lost,
although productivity and turnover improved.
The researchers argue that, by understanding
the mechanisms by which diversity is lost,
we might develop strategies for restoring and
preserving Earth’s biodiversity. Ryan Wilkinson


Suffocation of
gene expression


If a tumour outgrows its blood supply, oxygen levels in its cells decrease. It
emerges that this change can alter gene expression by limiting the activity of
TET enzymes, which remove methyl groups from DNA. SEE ARTICLE P.63


DAN YE & YUE XIONG


The addition of methyl groups to the
DNA base cytosine leads to decreased
gene expression, which has broad
implications for embryonic development and
tumour suppression’. Such methylation was
once considered to be irreversible, but in 2009,
it was found that ten-eleven translocation
(TET) enzymes could catalyse DNA demethylation’.
This discovery, fuelled by the finding
that the gene TET2 is frequently mutated in
human blood cancers’, sparked intense interest
in understanding the function and regulation
of this enzyme family. Thienpont et al.* report
on page 63 that TET activity is limited by oxygen
supply, revealing a general mechanism
by which gene expression can be silenced in
solid tumours.

TET proteins belong to a dioxygenase
enzyme family, members of which depend
on three cofactors for their activity: divalent
iron (Fe²⁺), the metabolite α-ketoglutarate
(αKG) and oxygen’. Fe²⁺ in the active site of
the enzyme is coordinated by αKG to split
an oxygen molecule into two oxygen
atoms. One oxygen atom attacks and
breaks a carbon-carbon bond in αKG,
leading to the conversion of the metabolite
to succinate and the release of carbon
dioxide. The other atom oxidizes a
carbon-hydrogen bond in the enzyme’s substrate
(Fig. 1). In TET-mediated reactions,
methylated cytosine (5-methylcytosine,
5mC) is oxidized to 5-hydroxymethylcytosine
(5hmC), and further oxidization follows, eventually
leading to the removal of methyl groups
and so to gene expression®”.

Figure 1 | Reducing TET activity through
hypoxia. The addition of a methyl group (CH₃) to
the DNA base cytosine to form 5-methylcytosine
(5mC) can lead to silencing of many genes,
including those that suppress tumour development.
TET enzymes, acting with the cofactor molecules
α-ketoglutarate (αKG) and oxygen, can trigger
the demethylation of 5mC. In the first step of this
reaction, O₂ is split into two atoms. One atom
breaks a carbon-carbon bond in αKG, leading
to succinate production and carbon dioxide
release. The other oxidizes a carbon-hydrogen
bond in CH₃ to form CH₂OH, converting 5mC
to 5-hydroxymethylcytosine (5hmC), eventually
leading to gene expression. Thienpont et al.’ report
that a shortage of O₂ in solid tumours inhibits TET
activity, leading to DNA hypermethylation.

In addition to mutations that inactivate 
TET genes, TET enzymes can be inactivated 
in tumours if their cofactors are unavailable. 
For example, the accumulation of αKG competitors
such as the metabolites 2-hydroxyglutarate
(2-HG)*”, succinate and fumarate’”
causes decreased TET activity. The discov- 
ery'”"* that these three metabolites accumulate 
in some tumours has led to the idea that cancer- 
promoting metabolites could have a general 
role in contributing to tumour development 
by altering the DNA-methylation landscape in 
cells, in much the same way that DNA damage 
causes cancer by altering the genomic 
landscape. Only a few types of cancer involve 
mutations in TET genes or show accumulation 
of αKG-competing metabolites. But the activity
of TET enzymes — measured by the production 
of 5hmC — seems to be substantially decreased 
in a wide range of tumours”. This discrepancy 
has remained unexplained until now. 

Solid tumours are oxygenated through 
blood vessels, but a tumour can rapidly 
outgrow its blood supply, leaving oxygen 
concentrations low in some regions. 
Thienpont et al. found that growing human 
or mouse cancer cells in such hypoxic con- 
ditions decreased 5hmC levels in some, 
but not all, of the cancer types they examined.
Upregulation of TET gene expression
could explain the cases in which no decrease
was seen. 

Why do 5hmC levels decrease in most 
cancer-cell types in hypoxic conditions? 
Damaging molecules called reactive oxygen 
species, which could impair TET activity by 
reducing the amount of Fe²⁺, and metabolites
that inhibit αKG such as 2-HG are known*”°
to be increased by hypoxia. High levels of 
these molecules could therefore impair TET 
activity. However, the authors excluded both 
as the cause of TET inhibition — supplements 
of vitamin C, which counteracts reactive 
oxygen species, or of αKG could not prevent
5hmC loss. Instead, analysis of enzyme kinetics
predicted a 45% decrease of TET1 activity
and a 52% decrease of TET2 activity in typical 
hypoxic tumour cells in mice. This is the first 
evidence that oxygen molecules are a rate- 
limiting factor for TET2 activity in tumours. 
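
To see how oxygen can become rate-limiting, the sketch below applies the standard Michaelis-Menten rate law with oxygen as the varying substrate. This is an assumption made for illustration: the text does not state which kinetic form the authors used, and the function names, the Km value and the oxygen levels are placeholders rather than measured values.

```python
def relative_rate(o2, km):
    """Michaelis-Menten rate as a fraction of the maximum rate."""
    if o2 < 0 or km <= 0:
        raise ValueError("o2 must be non-negative and km positive")
    return o2 / (km + o2)


def percent_decrease(o2_normoxic, o2_hypoxic, km):
    """Percentage loss of activity when oxygen falls from normoxic to hypoxic."""
    normal = relative_rate(o2_normoxic, km)
    if normal == 0:
        raise ValueError("normoxic oxygen level must be positive")
    hypoxic = relative_rate(o2_hypoxic, km)
    return 100.0 * (normal - hypoxic) / normal


# Placeholder numbers in arbitrary but matching units; not measured values.
loss = percent_decrease(o2_normoxic=5.0, o2_hypoxic=0.5, km=1.0)
print(f"{loss:.1f}% lower activity")
```

Under this form, an enzyme whose Km for oxygen is well below the hypoxic oxygen level would lose little activity, whereas one with a Km near or above it would lose a substantial fraction.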

Cytosine methylation typically occurs at 
CpG dinucleotide sites, where cytosine and 
guanine bases are found side by side. The 
authors analysed CpG methylation in a few 
tumours. CpG sites in most genes displayed 
increased 5mC levels that were concomitant 
with reduced 5hmC levels following hypoxia, 
suggesting a causal link between hypoxia and 
DNA hypermethylation. 

To test this link further, Thienpont et al. 
turned to previously established gene- 
expression patterns known to be a signature 
of hypoxia'®, to assign tumours to hypoxic, 
normal or intermediate groups. The authors 
separately clustered the tumours into those 
that showed low, intermediate and high CpG 
methylation states. Hypoxic tumours pre- 
dominated in the hypermethylated cluster, 
whereas normoxic tumours were enriched in 
the low-methylation cluster, providing further 
evidence that hypoxia leads to increased CpG 
methylation in tumours. 

Thienpont and colleagues next found that 
hypoxia-linked 5hmC loss and concurrent 
5mC gain were most apparent in promoter 
regions that drive gene expression — including 
the promoters of genes involved in DNA repair, 
the cell cycle, blood-vessel formation and can- 
cer spread. Finally, the authors induced global 
loss of 5hmC in vivo by inducing hypoxia, and 
reversed this effect by deleting one copy of the 
oxygen-sensor gene Phd2, reduced function of 
which is known to restore tumour oxygena- 
tion’. Collectively, these results suggest that 
hypoxia causes TET inhibition, a reduction 
in 5hmC levels and DNA hypermethylation, 
leading to altered gene expression. 

Oxygen shortage is unlikely to be the only 
factor that contributes to the widespread loss 
of 5hmC in tumours. 5hmC is a dynamic and
transient modification that could be affected 
by changes in TET gene transcription, by post- 
translational modifications of TET proteins 
or even by the dynamics of DNA methyla- 
tion. Thienpont and colleagues’ findings also 
raise the question of whether hypoxia could
impair the activity of other dioxygenases
that are dependent on Fe²⁺ and αKG, including
those involved in DNA repair and in
the demethylation of DNA-associated histone 
proteins. 

This study also has clinical implications. 
Many conditions, from heart failure to 
stroke, can cause lasting oxygen shortage. 
Is TET activity impaired in these settings in 
ways that alter gene expression, contribut- 
ing to disease progression? TET activity is 
frequently lost in solid tumours, but TET 
genes are rarely mutated. Could restoring 
TET activity in hypoxic tumours, for example 
by increasing levels of vitamin C, αKG or oxygen,
reactivate tumour-suppressor genes that
have been silenced by hypoxia-induced CpG 
hypermethylation? 

In human tumours, drugs that inhibit blood- 
vessel formation have only incremental and 
variable benefits’’, probably in part because 
hypoxia contributes to tumour progression 
and treatment resistance. In some patients, 
paradoxically, this treatment leads to increased 
tumour oxygenation and is associated with 
longer survival. Perhaps making an informed 
selection of patients on the basis of each indi- 
vidual’s TET activity and tumour methylation 
status could produce therapeutic benefits. 
Thienpont and colleagues’ study reveals a new 
perspective from which to further investigate 
the regulation of TET and other dioxygenases 
that are dependent on Fe²⁺ and αKG in the
development, and possibly therapeutic
intervention, of hypoxia-related diseases.

Muddy messages about
American migration


When and by which paths did early humans migrate into America? An analysis 
of ancient plant and animal remains revises the timeframe during which a route 
may have opened between ice sheets in northwest America. SEE ARTICLE P.45 


SUZANNE MCGOWAN 


Towards the end of the most recent ice
age, northwest America was covered
by two immense ice sheets. Where the
ice sheets split, there was an ice-free corridor, 
which was, for decades, considered to be the 
most probable route for the late-ice-age migra- 
tion of the first humans into the Americas from 
Siberia. The corridor was some 1,500 kilo- 
metres long, so any path between the ice sheets 
would have had to develop into a viable habitat 
to enable humans to journey through it. On 
page 45, Pedersen et al.¹ report an analysis of
ancient DNA, pollen and plant remains in lake- 
sediment samples from British Columbia and 
Alberta, Canada, at locations corresponding to 
stretches of the corridor. This research provides
the most complete picture yet of the timing
and pattern of plant and animal development
in a central ‘bottleneck’ region of the ice-free
corridor (thought to be one of the last places in 
the corridor to become habitable). 

It was long thought’ that human colonizers 
from Siberia travelled across the exposed 
Bering land bridge and then through the ice- 
free corridor in western Canada, because it 
was the only ice-free route to the continental 
interior. However, subsequent evidence has 
led to a major re-evaluation’. It is now widely 
accepted that the earliest humans were pre- 
sent in South America by around 14,700 years 
ago”? , and that coalescence of the ice sheets 
blocked the ice-free corridor from 23,000 years 
ago until at least 15,000-14,000 years ago’. 
Discoveries of the earliest remains in South 
America have cast doubt on whether humans
could have traversed the continent in a win- 
dow of, at most, 300 years. Many now favour 
an alternative Pacific-migration hypothesis to 
explain how early humans reached the Ameri- 
cas. This proposes that the earliest humans 
colonized the continent along the Pacific 
coastline, either by travelling along the ice-free 
land at coastal margins exposed by the lower 
sea levels, or by sea travel’. 

Nevertheless, the corridor remains a key 
potential route for early migrations, par- 
ticularly of the Clovis people who occupied 
North America from around 13,400 to 
12,800 years ago, and who are associated with 
mammoth hunting using distinctive fluted 
tools”**. A key requirement for assessing these 
different route hypotheses is understanding 
not only when physical opening of the ice sheet 
occurred, but also when the flora and fauna 
in the corridor could have supported humans 
travelling northwards or southwards”. 

Pedersen et al. analysed deposits of up to 
12,900 years old, in remnant basins of a large 
glacial meltwater lake that existed in one of 
the last corridor areas to lose ice cover. They 
used microscopy and radiocarbon dating to 
analyse pollen and plant remains from layers 
of lake mud, and also analysed ancient DNA”, 
to establish a timeline of biological changes in 
plants and animals. Ancient-DNA sequenc- 
ing allows researchers to identify the histori- 
cal presence of species such as mammals, birds 
and microorganisms whose remains are not 
visibly identifiable in sediments. 

The authors chose to analyse the ancient 
DNA using ‘shotgun’ DNA sequencing, 
which sequences random DNA fragments in 
an unbiased way. This approach requires fewer 
assumptions to be made about which spe- 
cies’ DNA is present than the more-specific, 
sequence-targeted ‘barcode’ method, which 
has been more extensively used in this field so 
far. As ancient-DNA-analysis techniques are 
starting to become more widely used" and 
genetic reference databases are improving, 
such developments are poised to revolution- 
ize studies of ancient and complex biological 
remains in sediments. The ability to determine 
the identity of entire ancient ecological com- 
munities, including the microbes present, has 
great relevance for understanding their fun- 
damental ecology, as well as providing insight 
into the biogeochemical processes responsible 
for the cycling of chemical elements through 
the environment. 

Analysis of pollen remains by Pedersen and 
colleagues suggests the presence of only sparse 
grasses and grass-like plants known as sedge 
in the ice-free corridor before 12,700 years 
ago. However, the authors found that the key 
ecological successional changes occurred 
when the landscape was colonized by grass- 
land vegetation known as steppe (or prairie), 
which included sagebrush, birch and willow. 
By 12,600 years ago, this steppe landscape
supported bison; by 12,400 years ago, small
mammals such as hares and voles had arrived;
and mammoths, elk and bald eagles followed
soon after. The presence of bison and mam-
moths is important because they are known to
have been hunted by early Americans’, and the
presence of a top predator such as the eagle
indicates a productive food web. The develop-
ment of coniferous forest around 10,000 years
ago is also relevant, because such environ-
ments are thought to have been impassable
to humans and unsuitable for the large prairie
mammals that humans hunted.

Figure 1 | Human migration into the Americas. Timeline of key events of human migrations into the
Americas, including when the Clovis people were present”** and the timing of the earliest known human
migrations into the Americas”. One proposed migration route into the Americas from Siberia is through
an ice-free corridor that opened up between two ice sheets in northwest America. The proposed ranges of
dates estimated by Pedersen et al.¹ and Heintzman et al.” for biological viability in this ice-free corridor
have implications for whether the migration of the Clovis people into the Americas could have occurred
through this pathway.

The window of opportunity for human 
movements in the ice-free corridor is esti- 
mated by Pedersen and colleagues to have been 
between 12,600 and 10,000 years ago. Intrigu- 
ingly, a recent publication in Proceedings of the 
National Academy of Sciences by Heintzman
et al.'* provides conflicting dates and suggests 
an earlier period during which the corridor 
was fully habitable (Fig. 1). Heintzman and 
colleagues analysed DNA from organelles
called mitochondria in the bones and teeth
of ancient bison. The authors tested the DNA
samples for the genetically distinctive sig-
natures of bison of ‘northern’ and ‘southern’
origin. Their results indicated that the first 
northward movements of southern bison 
occurred 13,400 years ago, and that northern 
bison had moved south by 13,000 years ago. 
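
The disagreement between the two studies comes down to whether the Clovis period overlaps the interval in which the corridor could support travellers. The sketch below makes that comparison explicit using the dates quoted in this article. The start of Heintzman and colleagues' window (13,000 years ago, by which time bison had crossed in both directions) and its end date of 10,000 years ago are simplifying assumptions made here for illustration.

```python
def overlap_years(a, b):
    """Overlap of two intervals given as (older, younger) in years before present."""
    older = min(a[0], b[0])      # overlap cannot start before the later opening
    younger = max(a[1], b[1])    # overlap cannot extend past the earlier ending
    return max(0, older - younger)

clovis = (13_400, 12_800)             # Clovis occupation, as quoted in the text
pedersen_window = (12_600, 10_000)    # viability window from Pedersen et al.
heintzman_window = (13_000, 10_000)   # ASSUMED bounds, see the note above

print(overlap_years(clovis, pedersen_window))    # 0
print(overlap_years(clovis, heintzman_window))   # 200
```

Under these assumptions the Clovis period ends before the corridor becomes viable on the first chronology, but overlaps it by about two centuries on the second.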

However, Pedersen et al. suggest that the 
northern clade of bison could have developed 
before ice-sheet closure 23,000 years ago, and 
traversed the corridor from north to south 
before this time. Resolving this debate might 
require further consideration of whether the 
absence of steppe pollen and ancient DNA in 
the earliest sediments from the corridor region 
constitutes proof of absence of the species of 
interest, because depositional conditions 
in proglacial lake environments are often 
unstable, leading to sediment reworking and 
degradation”. 

The analysis of ancient DNA in the ice-free 
corridor has provided a window onto ancient 
worlds. Pedersen et al. focus on the role of 
the food web in the steppe environment in 
supporting early humans, but, as they 
acknowledge, the ancient-DNA records are 
incomplete. For example, archaeological
records from the nearby Charlie Lake Cave 
indicate that fish and waterfowl were key 
dietary components of dwellers close to the 
lake from 12,700 years ago”, but the authors 
found no genetic evidence for these species at 
this time. By contrast, the ancient DNA fills in 
gaps in the pollen record, indicating that after 
12,400 years ago, poplar and willow trees were 
more common than previously estimated’. 
Together, the availability of wood and fresh- 
water resources might have implications for 
travel in this region — the vast glacial lakes 
that covered parts of the corridor might be 
considered more of an opportunity for water 
transport than an impediment to passage. 
The Pacific-migration hypothesis implies 
earlier use of watercraft than has previously 
been assumed’. Further investigation of the 
hypotheses for human migration into the 
Americas will require close integration of stud- 
ies analysing archaeology, genetics and ancient 
environments, which should, in turn, identify 
pathways for developing more-complete inter- 
pretations of sedimentary ancient DNA™. 

Postglacial viability and colonization
in North America’s ice-free corridor


Mikkel W. Pedersen, Anthony Ruter, Charles Schweger, Harvey Friebe, Richard A. Staff, Kristian K. Kjeldsen,
Marie L. Z. Mendoza, Alwynne B. Beaudoin, Cynthia Zutter, Nicolaj K. Larsen, Ben A. Potter, Rasmus Nielsen,
Rebecca A. Rainville, Ludovic Orlando, David J. Meltzer, Kurt H. Kjær & Eske Willerslev


During the Last Glacial Maximum, continental ice sheets isolated Beringia (northeast Siberia and northwest North 
America) from unglaciated North America. By around 15 to 14 thousand calibrated radiocarbon years before present 
(cal. kyr BP), glacial retreat opened an approximately 1,500-km-long corridor between the ice sheets. It remains unclear 
when plants and animals colonized this corridor and it became biologically viable for human migration. We obtained 
radiocarbon dates, pollen, macrofossils and metagenomic DNA from lake sediment cores in a bottleneck portion of the 
corridor. We find evidence of steppe vegetation, bison and mammoth by approximately 12.6 cal. kyr BP, followed by 
open forest, with evidence of moose and elk at about 11.5 cal. kyr BP, and boreal forest approximately 10 cal. kyr BP. Our
findings reveal that the first Americans, whether Clovis or earlier groups in unglaciated North America before 12.6 cal.
kyr BP, are unlikely to have travelled by this route into the Americas. However, later groups may have used this
north-south passageway.


Understanding the postglacial emergence of an unglaciated and 
biologically viable corridor between the retreating Cordilleran and 
Laurentide ice sheets is a key part of the debate on human colonization 
of the Americas'*. The opening of the ice-free corridor, long 
considered the sole entry route for the first Americans, closely precedes 
the ‘abrupt appearance’ of Clovis, the earliest widespread archaeological 
complex south of the ice sheets at ~13.4 cal. kyr BP*”. This view has
been challenged by recent archaeological evidence that suggests people
were in the Americas by at least 14.7 cal. kyr BP®’, and possibly several
millennia earlier’. Whether this earlier presence relates to Clovis groups
remains debated’. Regardless, as it predates all but the oldest estimates
for the opening of the ice-free corridor'®!, archaeological attention has
shifted to the Pacific coast as an alternative early entry route into the
Americas’. Yet, the possibility of a later entry in Clovis times through
an interior ice-free corridor remains open?”"?.

Whether the ice-free corridor could have been used for a Clovis-age 
migration depends on when it became biologically viable. However, 
determining this has proven difficult because radiocarbon and 
luminescence dating of ice retreat yield conflicting estimates for when 
the corridor opened, precluding precise reconstruction of deglaciation 
chronology'®'?-!”. Once the landscape was free of ice and meltwater,
it was open for occupation by plants and animals, including those 
necessary for human subsistence. On the basis of studies on modern 
glaciers'®, the onset of biological viability could have been brief (for 
example, a few decades) if propagules were available in adjacent
areas, and assuming they were capable of colonizing what would have
been a base-rich (high pH) and nitrogen-poor soil substrate (as
nitrogen-fixing plants such as Shepherdia canadensis (buffaloberry) are).

Establishment of biota within the corridor region must have varied 
locally depending on the rate and geometry of ice retreat, the extent of 
landscape flooding under meltwater lakes, and the proximity of plant
and animal taxa and their dispersal mechanisms!!*°. Some areas
were habitable long before others. Although the corridor’s deglaciation 
history was complex, broadly speaking it first opened from its southern 
and northern ends, leaving a central bottleneck that extended from 
approximately 55 °N to 60 °N!®!3-!521. On the basis of currently 
available geological evidence, this was the last segment to become ice 
free and re-colonized by plants and animals?!3??"7*.

Although palynological and palaeontological data can be used to 
help study the opening of the corridor region, these are limited in 
several respects. First, not all vegetation, particularly pioneering forbs 
and shrubs, produce pollen and macrofossils with good preservation 
potential that will be detectable in available depositional locales. Hence, 
timing of plants’ appearance may be underestimated. Second, pollen 
can disperse over long distances and have limited taxonomic resolution, 
differential preservation, and variable production rates, all of which can 
bias vegetation reconstruction”. Third, fossil evidence for initial large 
mammal populations that dispersed into the newly opened corridor 
is sparse. The fossil remains suggest the presence of bison, horse and 
mammoth, and probably some camel, muskox and caribou”®?’. Yet, the 
oldest vertebrate remains after the Last Glacial Maximum are no older 
than ~13.5 cal. kyr BP“, and those specimens are found outside the
bottleneck region!?*5°. These animals would have been the source
populations to recolonize the newly opened landscape, and thus their 
presence within the bottleneck region can indicate when the corridor 
became a viable passageway over its entirety. 


Samples and analytical approaches 

To overcome current limitations of the palaeoecological record, and
develop a more precise chronology for the opening and biological
viability of the corridor’s bottleneck region, we collected nine lake sediment
cores from Charlie Lake and Spring Lake in the Peace River drainage
basin (Fig. 1). These are remnants of Glacial Lake Peace, which formed
as the Laurentide Ice Sheet began to retreat in this region around 15 to
13.5 cal. kyr BP and blocked eastward draining rivers!°'?-'>?! (Extended
Data Fig. 1). Glacial Lake Peace flooded the gap between the ice fronts
until about 13 cal. kyr BP, sometime after which Charlie and Spring
lakes became isolated!°. Thus, this area was amongst the last segments
of the corridor to open and is pivotal to understanding its history as a
biogeographic passageway ?!> 14162224.

Figure 1 | Setting and study area. During the Last Glacial Maximum, the
Laurentide Ice Sheet and the Cordilleran Ice Sheet coalesced in western
mid-Canada creating a physical barrier to north-south migration. Following
the Last Glacial Maximum, the ice retreated creating an ice-free corridor
(IFC). a, Ice extent!” during two periods, Last Glacial Maximum 21.4 cal. kyr
BP (off-white) and Late Pleistocene 14.1 cal. kyr BP (light-blue).

Of the nine cores obtained from Charlie Lake and Spring Lake, one 
from each lake predates the Pleistocene to Holocene transition, the 
oldest dating to ~12.9 cal. kyr BP (modelled age). We sampled the
cores from both lakes for magnetic susceptibility, pollen*°!, micro-
and macrofossils, including ¹⁴C-dateable material for subsequent
robust Bayesian age-depth modelling (Fig. 2, Methods, Extended Data 
Figs 2-4 and Supplementary Information). In addition, we obtained 
environmental DNA (eDNA)”, representing molecular fossils of 
local organisms derived from somatic tissues, urine and faeces*?, but 
rarely pollen*4. eDNA complements traditional pollen and macrofossil
studies*°, and is especially useful for establishing the likelihood that 
a taxon occurred within a particular time period***”. Furthermore, 
eDNA enables identification of taxa even in the absence of micro- and 
macrofossil material, thus improving the resolution of taxonomic 
richness surveys*®. However, amplification of short and taxonom- 
ically informative DNA metabarcodes*® can be biased towards taxa 
targeting**. We used shotgun sequencing of the full metagenome in the 
DNA extracts to reveal the whole diversity of taxonomic groups present 
in the sediment*? (Fig. 2, Methods, Extended Data Figs 5 and 6 and 
Supplementary Information). We confirmed the sequences identified 
as ancient by quantifying DNA damage*”, and found the DNA damage 
levels to accumulate with age (Pearson correlation coefficient = 0.663, 
P value = 0.00012) (Methods and Extended Data Fig. 7a, b). 
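
The reported relationship between DNA damage and age is a standard Pearson coefficient. As a minimal sketch of the calculation, assuming one damage value per dated sample, the function below computes r from paired lists. The ages and damage fractions shown are invented placeholders, not the study's data, and the P value is not reproduced here.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical sample ages (cal. kyr BP) and damage fractions (NOT from the paper)
ages = [3.0, 5.5, 7.0, 10.0, 11.6, 12.6]
damage = [0.04, 0.07, 0.06, 0.10, 0.09, 0.13]
print(f"r = {pearson_r(ages, damage):.3f}")
```

A positive r, as reported in the text, indicates that older samples tend to carry more damage, which is the pattern expected of authentic ancient DNA.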


Biological succession within the corridor bottleneck 

The basal deposit in the Charlie Lake core is proglacial gravel,
previously reported from the area’, above which are laminated
lacustrine sediments, principally composed of silt-sized grains”*
(Extended Data Fig. 2). We interpret these as deposits from Glacial Lake
Peace Stage IV (ref. 13), the >15,000 km² proglacial lake that covered
the Peace River area of northeastern British Columbia and northwest-
ern Alberta. A subsequent lithological change from silt to sandy organic
rich mud (gyttja) at the onset of the Holocene, around 11.6 cal. kyr BP,
reflects a change in sediment source and lake productivity that we interpret
as Charlie Lake becoming isolated from Glacial Lake Peace (Fig. 1).
This is followed by a decrease in pollen influx in both lake records at
~11.5 cal. kyr BP that coincides with an increase in pre-Quaternary
palynomorphs. At Charlie Lake there is then a marked increase in
pollen influx at ~11.3 cal. kyr BP. We interpret these fluctuations as
responses of a highly dynamic landscape to paraglacial and aeolian
redepositional processes.

Figure 1 | b, Topography of the Peace River basin with Glacial Lake Peace Phase III
(white lines with blue outlines) and Phase IV"? with ice extent’® (light-blue
and dark-blue) at around 14.1 cal. kyr BP and 13 cal. kyr BP, respectively.
The red and white lines mark topographic transects of the lakes which in
relation to the four phases of Glacial Lake Peace! is found in Extended
Data Fig. 1.

Our palynological and eDNA-based taxonomic identifications, 
respectively, reveal the development of biota in the regional and local 
environment surrounding each lake (Fig. 2, Extended Data Figs 3-6). 
Prior to ~12.6 cal. kyr BP (Charlie Lake, pollen zone I, ~13 to 12.6 cal.
kyr BP), the bottleneck area appears to have been largely unvegetated,
receiving low pollen influx (<50 grains cm⁻² y⁻¹) with little organic
content (incoherent/coherent ratio) and low DNA concentrations 
(<5 ng per g of sediment). During the later phases of Glacial Lake 
Peace, both pollen and eDNA indicate grasses and sedges were 
early colonizers. Charlie Lake pollen zone II (~12.6 to 11.6 cal. kyr 
BP) contains evidence of steppe vegetation, including Artemisia 
(sagebrush), Asteraceae (sunflower family), Ranunculaceae (buttercup 
family), Rosaceae (rose family, rosids in eDNA), Betula (birch), and 
Salix (willow). A similar plant community is recorded at Spring 
Lake (pollen zone I), with substantial abundances of Populus and
S. canadensis, probably due to elevation differences and because by 
this time Spring Lake was no longer part of the Glacial Lake Peace 
system. 

eDNA indicates the steppe vegetation supported a variety of animals 
including Bison which appear at ~12.5 cal. kyr BP, and Microtus (vole)
and Lepus (jackrabbit) by ~12.4 cal. kyr BP (Fig. 3). After 12.4 cal. kyr BP,
Populus trees became more dominant and Cervus (elk), Haliaeetus
(bald eagle) and Alces (moose) appear in the eDNA record. The
productivity of the bottleneck increased to a peak at ~11.6 cal. kyr BP.
The presence of Esox (pike), a top aquatic predator, implies that by
~11.7 cal. kyr BP, a fish community was already established. After
11.6 cal. kyr BP, Picea (spruce), Pinus (pine) and Betula pollen increased
in the Charlie Lake pollen record, reflecting the establishment of boreal 
forest. 

Around 11.5 cal. kyr BP, a distinct decline occurred in pollen influx
at both lakes. High abundance of Botryococcus (green algae) in each
is probably a response to changing nutrient sources, lake chemistry,
sediment input and possibly reduced turbidity following isolation of
these basins from Glacial Lake Peace*!. Botryococcus dominated the
early Holocene sequence in Spring Lake (11.7-11.5 cal. kyr BP) but
declined relative to Pediastrum (green algae) after 11.0 cal. kyr BP,
consistent with eutrophication in a more productive ecosystem. Pollen
and plant macrofossils indicate Alnus (alder) was in the vicinity of
Spring Lake at about 7.0 cal. kyr BP, although it is not evident in eDNA
until approximately 5.5 cal. kyr BP.

We used non-metric multi-dimensional scaling (NMDS) based on
Bray-Curtis similarity measures to explore whether the eDNA plant
communities, excluding algae, reflect the pollen data (Fig. 2b, d).
In eDNA samples, the first NMDS axis matches the clear separation
between major pollen zones at Spring Lake and Charlie Lake. The only
exception is represented by the 12.2 cal. kyr BP sample at Charlie Lake,
which does not cluster with other samples of similar age (~12.6-11.6
cal. kyr BP) but is closer to the arboreal and younger samples from
pollen zone III. Nevertheless, consistency between the main pollen zones
and clustering of eDNA samples confirms that large ecological changes
found in pollen records can be identified using eDNA.

Despite good conformity between palynological and eDNA
data, some discrepancies suggest these proxies are variably affected
by a plant’s reproductive process and taphonomic history (see
Supplementary Information). The most notable of these discrep-
ancies is the Populus record. In Charlie Lake, its pollen and eDNA
signals are congruent from ~11.6-11.2 cal. kyr BP, whereas earlier
(~12.4-12.1 cal. kyr BP) the eDNA signal for Populus is more
pronounced. In Spring Lake, Populus pollen only occurs towards the
base of the record and in upper zone III, whereas in the eDNA record
it is abundant throughout. This discrepancy is probably due to Populus
reproducing vegetatively, and its notoriously low detection rates and
poor pollen preservation, which often render it palynologically ‘silent’.
The eDNA reveals that poplar was probably more abundant in the
regional vegetation than has previously been shown with palynology.
This has important implications for human occupation as poplar would
have provided wood for fuel, shelter, and tools, as well as browse feeding
for animals.

Figure 2 | Selected pollen, DNA and biometrical results. a, c, Pollen are
presented as influx (area; grains cm⁻² y⁻¹) and DNA taxa presented with normalized counts
(bars). HS Asteraceae, high spike Asteraceae. Metazoans are presented
with bullet points indicating their presence. The 5 point average (5p)
of the incoherent/coherent (incoh/coh) ratio is derived from the X-ray
fluorescence results and an increasing ratio represents increased organic
content. b, d, Non-metric multi-dimensional scaling plots; grey ellipses
marked I, II, and III encircle the samples corresponding to the respective
CONISS pollen zonation. Coloured dots indicate each taxon identified.
The coloured categories are identical to the pollen and DNA taxa in
Charlie Lake (a), and Spring Lake (c).
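
The NMDS ordination described above starts from pairwise Bray-Curtis values between samples. As a rough sketch, assuming each sample is a vector of taxon counts listed in the same taxon order, the dissimilarity is the summed absolute difference divided by the summed total, and the similarity is one minus that value. The counts below are invented, and the NMDS step itself is not shown.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative count vectors."""
    if len(u) != len(v):
        raise ValueError("samples must list the same taxa in the same order")
    total = sum(u) + sum(v)
    if total == 0:
        raise ValueError("both samples are empty")
    return sum(abs(a - b) for a, b in zip(u, v)) / total

# Hypothetical normalized read counts per taxon (NOT from the paper),
# in the order: Artemisia, Salix, Populus, Picea
samples = {
    "zone_II_a": [30, 20, 10, 0],
    "zone_II_b": [25, 25, 10, 0],
    "zone_III": [5, 10, 15, 30],
}
names = list(samples)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        d = bray_curtis(samples[first], samples[second])
        print(f"{first} vs {second}: dissimilarity {d:.2f}, similarity {1 - d:.2f}")
```

Samples from the same pollen zone should score as more similar than samples from different zones, which is the pattern the ordination is used to check.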

The differences between the pollen and eDNA evidence for plants 
might also reflect dispersal factors. Wind-dispersed pollen is more 
likely to be encountered in lake-based pollen records, whereas 
predominantly insect-pollinated taxa are less likely to settle in lake 
sediments and be detected. Many willows (Salix spp.), for example, 
are insect pollinated. Their pollen is present in low percentage (5%) in 
zone II in Charlie Lake, but in higher abundance in zones II and III in 
the eDNA record (Extended Data Figs 5 and 6). This suggests the eDNA 
comes more from macrofossils and plant debris than from pollen. 

The eDNA record also detects taxa not present in fossil bone 
assemblages, including terrestrial and aquatic vertebrates. In 
particular, it identifies top-level aquatic (Esox) and avian (Haliaeetus) 
predators, which indicate a rich supporting community at lower 
trophic levels. Cervus is evident in the Charlie Lake record at about
11.5 cal. kyr BP, whereas its earliest fossil remains from the area date to about
10.2 cal. kyr BP**. Small mammals, such as Microtus, are documented
in the Charlie Lake eDNA at 12.4 cal. kyr BP, confirming the Microtus
colony found just west of Charlie Lake, at Bear Flats*’. Yet, there are
also notable absences in eDNA compared to the vertebrate record. For
example, faunal remains from the adjacent Charlie Lake Cave, dated to
~12.4 cal. kyr BP*, are rich in waterfowl and other birds and fish not
detected by eDNA. In the Spring Lake eDNA record, Castor (beaver)
appears between 5.4 and 3 cal. kyr BP, whereas evidence from Wood
Bog* ~60 km to the south suggests that the beaver was part of the local
fauna since at least 11 cal. kyr BP.

Figure 3 | Ecological interpretation and implications of this study.
Timeline of the biology in the bottleneck area linking it with evidence of
human occupation and the first appearance of Clovis technology (see also
Fig. 4). Grey animal silhouettes are vertebrate genera that were identified
by environmental DNA in both lake cores.

When the evidence from these multiple proxies is combined, it 
provides a more robust record of the presence of plants and animals 
than any single indicator. It is, of course, possible that some taxa arrived 
on the landscape earlier and escaped detection, thus appearing absent. 
However, there was only a narrow window of time between when the
bottleneck region was beneath the waters of Glacial Lake Peace and
impassable, and when these proxies first detect the presence of plants
and animals. The eDNA data are particularly important for indicating 
the earliest occurrence of terrestrial fauna in the bottleneck region, 
particularly the game animals that would have been key subsistence 
resources for hunter-gatherers*®. 


Discussion 

Although ice sheet retreat led to the corridor physically opening in 
the bottleneck region starting around 15-14 cal. kyr BP’”, deglaciation
was followed by regional inundation below the waters of Glacial Lake
Peace for perhaps up to 2,000 years!°. By around 12.6 cal. kyr BP the ice
sheets were several hundred kilometres apart and the landscape had
become vegetated. Large and small animals came in soon thereafter,
around 12.5 cal. kyr BP, making the corridor capable of supplying the
biotic resources, including high-ranked prey such as bison, required
by human foragers for the 1,500 km traverse*’”. This result is consistent
with the recent finding that the oldest of the southern bison clade
specimens (clades 1a and 2b) found north of the bottleneck region
postdates 12.5 cal. kyr BP, though not with the finding that it opened
earlier’ (see Supplementary Information).

From our findings, it follows that an ice-free corridor was unavailable 
to those groups who appear to have arrived in the Americas south of the 
continental ice sheets by 14.7 cal. kyr BP®’, and also opened too late to
have served as an entry route for the ancestors of Clovis who were
present by 13.4 cal. kyr BP’. Not surprisingly, the earliest archaeological
presence in the Peace River region, at Charlie Lake Cave (Fig. 3) and
Saskatoon Mountain**’, postdates 12.6 cal. kyr BP. More striking, once
opened, the corridor was not used just for southbound movement:
archaeological evidence suggests that people were moving north as well,
potentially renewing contact between groups that had been separated
for millennia’. Bison? were also colonizing the corridor and moving
north and south; it is uncertain whether other species, such as elk? and
brown bears“’, were moving similarly.

More broadly, although Clovis people may yet be shown to represent
an independent migration separate from the peoples present here by
14.7 cal. kyr BP, they must have descended from a population that
entered the Americas via a different route than the ice-free corridor.
This conclusion is relevant to the recent finding”? that ancestral Native
Americans diverged into southern and northern branches ~13 cal. kyr BP
(95% confidence interval of 14.5-11.5 cal. kyr BP). This implies that
if that split occurred north of the ice sheets, there must have been
two pulses of migration to the south. As the Anzick infant’s genome,
dated to 12.6 cal. kyr BP and associated with Clovis artefacts, is part
of the southern branch*®, its ancestors must have travelled via the
coast. However, this does not preclude the possibility that ancestors
of the northern branch left Alaska later, through a then-viable ice-
free corridor. Alternatively, if the divergence occurred in unglaciated
North America, as recently proposed”, it implies a single ancestral
population came via the coast. It further raises the possibility that the
northern branch (the descendants occupying Alaska today) made
their way north to Alaska via the corridor after 12.6 cal. kyr BP. Further
investigations of ancient DNA may help resolve this issue.

Figure 4 | Colonization models. Comparison of models of Paleoindian colonization (number of pulses, timing, and route(s)) that are supported
or rejected by our data. All ages are in calibrated years before present.
Panels recoverable from the figure: Model 1, single-wave colonization along the IFC (not supported by our data);
Model 2, single-wave colonization along coast, with later movement into the IFC (consistent with our data).


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
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METHODS 


Data reporting. No statistical methods were used to predetermine sample size. The 
experiments were not randomized. The investigators were not blinded to allocation 
during experiments and outcome assessment. 
Sediment sampling. We obtained 23 sediment cores from 8 different lakes by
using a percussion corer deployed from the frozen lake surface. To prevent
internal mixing, we discarded all upper suspended sediments and
only kept the compacted sediment for further investigation. Cores were cut
into smaller sections to allow transport and storage. All cores were taken to
laboratories at the University of Calgary and were stored cold at 5 °C until
subsequent subsampling. Cores were split using an adjustable tile saw, cutting
only the PVC pipe. The split half was taken into a positive pressure laboratory
for DNA subsampling. DNA samples were taken by personnel wearing a full body
suit, mask and sterile gloves; the top 10 mm were removed using two sterile
scalpels and samples were taken with a 5 ml sterile disposable syringe
(3-4 cm³) and transferred to a 15 ml sterile spin tube. Care was taken not to
cross-contaminate between layers or to sample sediments in contact with the
inner side of the PVC pipe. Samples were taken every centimetre in the lowest
1 m of the core (except for Spring Lake, the lowest 2 m), then at intervals of
2 cm higher up, and finally every 5 cm; all samples were subsequently frozen
until analysed. Pollen samples were taken immediately next to the DNA samples,
while macrofossil samples were cut from the remaining layer in 1 cm or 2 cm
slices. Following sampling, the second intact core halves were visually
described and wrapped for transport. All cores were stored at 5 °C before,
during and after shipment to the University of Copenhagen (Denmark).
Core logging and scanning. An ITRAX core scanner was used to take high- 
resolution images and to measure magnetic susceptibility at the Department of 
Geoscience, Aarhus University. Magnetic susceptibility was measured every
0.5 cm using a Bartington Instruments MS2 system (Extended Data Fig. 2).
Pollen and macrofossil extraction and identification. Pollen was extracted
using a standard protocol. Lycopodium markers were added to determine pollen
concentrations (see Supplementary Information). Samples were mounted in
(2000 cs) silicone oil and pollen including spores were counted using a Leica
Laborlux-S microscope at ×400 magnification and identified using keys as
well as reference collections of North American and Arctic pollen housed at the
University of Alberta and the Danish Natural History Museum, respectively. Pollen
and pteridophyte spores were identified at least to family level and, more typically,
to genus. Green algae coenobia of Pediastrum boryanum and Botryococcus were
recorded to track changes in lake trophic status. Pollen influx values were calculated
using pollen concentrations divided by the deposition rate (see Supplementary
Information). Microfossil diagrams were produced and analysed using PSIMPOLL
4.10 (ref. 31). The sequences were zoned with CONIIC, with a stratigraphically
constrained clustering technique using the information statistic as a distance
measure. All macrofossils were retrieved using a 100 µm mesh size and were
identified but not quantified.
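The concentration and influx calculation described above can be sketched as follows. This is a minimal illustration of the usual exotic-marker arithmetic, not the authors' own script: the function names, the example numbers, and the reading of "deposition rate" as deposition time in years per centimetre are all assumptions made here.

```python
def pollen_concentration(pollen_counted, markers_counted, markers_added,
                         sediment_volume_cm3):
    """Grains per cm^3, scaled by the recovery of the added Lycopodium markers."""
    if markers_counted <= 0 or sediment_volume_cm3 <= 0:
        raise ValueError("marker count and sediment volume must be positive")
    return pollen_counted / markers_counted * markers_added / sediment_volume_cm3


def pollen_influx(concentration_per_cm3, deposition_time_yr_per_cm):
    """Grains per cm^2 per year, assuming deposition is given in years per cm."""
    if deposition_time_yr_per_cm <= 0:
        raise ValueError("deposition time must be positive")
    return concentration_per_cm3 / deposition_time_yr_per_cm


# Hypothetical sample: 300 pollen grains and 150 markers counted,
# 10,000 markers added to 2 cm^3 of sediment, 25 years per cm of sediment.
concentration = pollen_concentration(300, 150, 10_000, 2.0)
influx = pollen_influx(concentration, 25.0)
print(concentration, influx)
```

If the deposition rate were instead expressed in centimetres per year, the concentration would be multiplied by it rather than divided.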
Radiocarbon dating and age-depth modelling. Plant macrofossils identified as
terrestrial taxa (or unidentifiable macrofossils with terrestrial characteristics where
no preferable material could be identified) were selected for radiocarbon (¹⁴C)
dating of the lacustrine sediment. All macrofossils were subjected to a standard
acid-base-acid (ABA) chemical pre-treatment at the Oxford Radiocarbon
Accelerator Unit (ORAU), following a standard protocol, with appropriate
‘known age’ (that is, independently dendrochronologically dated tree-ring) standards
run alongside the unknown-age plant macrofossil samples. Specifically, this
ABA chemical pre-treatment (ORAU laboratory pre-treatment code ‘VV’) involved
successive 1 M HCl (20 min, 80 °C), 0.2 M NaOH (20 min, 80 °C) and 1 M HCl
(1 h, 80 °C) washes, with each stage followed by rinsing to neutrality (>3 times)
with ultrapure MilliQ deionised water. The three principal stages of this process
(successive ABA washes) are similar across most radiocarbon laboratories and
are, respectively, intended to remove: (i) sedimentary and other carbonate
contaminants; (ii) organic (principally humic and fulvic) acid contaminants; and
(iii) any dissolved atmospheric CO₂ that might have been absorbed during the
preceding base wash. Thus, any potential secondary carbon contamination was
removed, leaving the samples pure for combustion and graphitisation. Accelerator
mass spectrometry (AMS) ¹⁴C dating was subsequently performed on the 2.5 MV
HVEE tandem AMS system at ORAU. As is standard practice, measurements
were corrected for natural isotopic fractionation by normalizing the data to a standard
δ¹³C value of −25‰ VPDB, before reporting as conventional ¹⁴C ages before
present (BP, before AD 1950).
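For readers who want to relate a normalized ¹⁴C ratio to a conventional age, the standard convention can be sketched as below. This is an illustration only, not the ORAU reduction code: it assumes the ratio is expressed as fraction modern (F¹⁴C) and uses the conventional Libby mean life of 8,033 years. Reading the value 1.03763 ± 0.00595 in Extended Data Table 1 as F¹⁴C is likewise an assumption made here.

```python
import math

LIBBY_MEAN_LIFE_YR = 8033.0  # conventional mean life (Libby half-life 5,568 yr)


def conventional_age(f14c, f14c_sigma):
    """Return (age, 1-sigma) in 14C years BP from fraction modern."""
    if f14c <= 0:
        raise ValueError("fraction modern must be positive")
    age = -LIBBY_MEAN_LIFE_YR * math.log(f14c)
    # First-order error propagation: d(age)/dF = -8033 / F
    sigma = LIBBY_MEAN_LIFE_YR * f14c_sigma / f14c
    return age, sigma


# A ratio above 1 gives a negative age, that is, material younger than AD 1950.
age, sigma = conventional_age(1.03763, 0.00595)
print(round(age), round(sigma))
```

A negative result of this kind is consistent with a sample being reported as a ratio rather than as an age in years BP.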

These ¹⁴C data were calibrated with the IntCal13 calibration curve and
modelled using the Bayesian statistical software OxCal v. 4.2 (ref. 60). Poisson
process (‘P_Sequence’) deposition models were applied to each of the Charlie and
Spring Lake sediment profiles, with objective ‘Outlier’ analysis applied to each of
the constituent ¹⁴C determinations. The P_Sequence model takes into account
the complexity (randomness) of the underlying sedimentation process, and thus
provides realistic age-depth models for the sediment profiles on the calibrated
radiocarbon (IntCal) timescale. The rigidity of the P_Sequence (the regularity of
the sedimentation rate) is determined iteratively within OxCal through a model
averaging approach, based upon the likelihood (calibrated ¹⁴C) data included
within the model. A prior ‘Outlier’ probability of 5% was applied to each of
the ¹⁴C determinations, because there was no reason, a priori, to believe that any
samples were more likely to be statistical outliers than others. All ¹⁴C determinations
are provided in Extended Data Table 1; OxCal model coding is provided in
the Supplementary Information; and plots of the age-depth models derived for
Spring and Charlie Lakes are given in Extended Data Fig. 2.

DNA analysis. All DNA extractions and pre-PCR analyses were performed in
the ancient DNA facilities of the Centre for GeoGenetics, Copenhagen. Total
genomic DNA was extracted using a modified version of an organic extraction
protocol. We used a lysis buffer containing 68 mM N-lauroylsarcosine sodium
salt, 50 mM Tris-HCl (pH 8.0), 150 mM NaCl, and 20 mM EDTA (pH 8.0) and,
immediately before extraction, 1.5 ml 2-mercaptoethanol and 1.0 ml 1 M DTT were
added for each 30 ml lysis buffer. Approximately 2 g of sediment was added to
3 ml of buffer, together with 170 µg of proteinase K, and vortexed vigorously for
2 × 20 s using a FastPrep-24 at speed 4.0 m s⁻¹. An additional 170 µg of proteinase
K was added to each sample, which was then incubated, gently rotating, overnight
at 37 °C. For removal of inhibitors we used the MOBIO (MO BIO Laboratories,
Carlsbad, CA) C2 and C3 buffers following the manufacturer's protocol. The
extracts were further purified using phenol-chloroform and concentrated using
30 kDa Amicon Ultra-4 centrifugal filters as described in the Andersen extraction
protocol. Our extraction method differed from this protocol as follows:
no lysis matrix was added owing to the minerogenic nature of the samples, and the
two-phenol, one-chloroform step was altered so that phenol, chloroform and
supernatant were added simultaneously in the respective ratio 1:0.5:1, followed
by gentle rotation at room temperature for 10 min and centrifugation for 5 min
at 3,200g. For dark-coloured extracts, this phenol:chloroform step was repeated. All
extracts were quantified using the Quant-iT dsDNA HS assay kit (Invitrogen) on a
Qubit 2.0 Fluorometer according to the manufacturer’s manual. The measured
concentrations were used to calculate the total ng DNA extracted per g of sediment
(Fig. 2). Thirty-two samples were prepared for shotgun metagenome sequencing using
the NEBNext DNA Library Prep Master Mix Set for 454 (New England BioLabs)
following the manufacturer’s protocol with the following modifications:
(i) all reaction volumes (except for the end repair step) were decreased to
half the size given in the protocol, and (ii) all purification steps were performed
using the MinElute PCR Purification kit (Qiagen). Metagenome libraries
were amplified using AmpliTaq Gold (Applied Biosystems) with 14-20
cycles and quantified using the 2100 BioAnalyzer chip (Agilent). All
libraries were purified using Agencourt AMPure XP beads (BeckmanCoulter),
quantified on the 2100 BioAnalyzer and pooled equimolarly. All pooled libraries
were sequenced on an Illumina HiSeq 2500 platform and treated as single-end
reads.
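The conversion from a measured extract concentration to total DNA per gram of sediment is simple arithmetic and can be sketched as below. The function name, the final extract volume and the example values are hypothetical; the text states only that measured concentrations were converted to ng DNA per g of sediment.

```python
def dna_yield_ng_per_g(conc_ng_per_ul, extract_volume_ul, sediment_mass_g):
    """Total DNA recovered (ng) divided by the sediment mass extracted (g)."""
    if sediment_mass_g <= 0:
        raise ValueError("sediment mass must be positive")
    total_ng = conc_ng_per_ul * extract_volume_ul
    return total_ng / sediment_mass_g


# Hypothetical extract: 0.5 ng/µl in a 100 µl extract, from 2 g of sediment.
print(dna_yield_ng_per_g(0.5, 100.0, 2.0))
```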

Bioinformatics. Metagenomic reads were demultiplexed and trimmed using
AdapterRemoval 1.5 (ref. 65) with a minimum base quality of 30 and minimum
length of 30 bp. All reads with poly-A/T tails > 4 were removed from each
sample. Low-quality reads and duplicates were removed using String Graph
Assembler (SGA), setting the preprocessing tool dust-threshold = 1 and the index
algorithm = ‘ropebwt’, and using the SGA filter tool to remove exact and contained
duplicates. Each quality-controlled (QC) read was thereafter allowed an equal chance
to map to reference sequences using Bowtie2 version 2.2.4 (ref. 68) (end-to-end
alignment and mode -k 50; for example, reads were allowed a total of 500 hits
before being parsed). A few reads with more than 500 matches were confirmed
by checking that the best BLAST hit belonged to the assigned taxon, and that
alternative hits had higher e-values and lower alignment scores. We used the full
nucleotide database (nt) from GenBank (accessed 4 March 2015), which owing to
its size and downstream handling was divided into 9 consecutive equally sized
databases and indexed using Bowtie2-build. All QC-checked fastq files were
aligned end-to-end using Bowtie2 default settings. Each alignment was merged
using SAMtools, sorted according to read identifier and imported to MEGAN
v. 10.5 (ref. 70). We performed a lowest common ancestor (LCA) analysis using
the built-in algorithm in MEGAN and computed the taxonomic assignments
employing the embedded NCBI taxonomic tree (March 2015 version) on reads
having 100% matches to a reference sequence. We call this pipeline ‘Holi’
because it takes a holistic approach: it makes no a priori assumption about the
environment, and each read is given an equal chance to align against the nt
database containing the vast majority of organismal sequences (see
Supplementary Information). In silico testing of ‘Holi’ sensitivity
(see Supplementary Information) revealed 0.1% as a reliable minimum threshold
for Viridiplantae taxa. For metazoan reads, which were found to be
under-represented in our data, we set this threshold to 3 unique reads in one sample
or 3 unique reads in three different samples from the same lake. In addition, we
confirmed each metazoan read by checking that the best BLAST hit
belonged to the assigned taxon, and that alternative hits had higher e-values and
lower alignment scores. We merged all sequences from all blanks and subtracted
these from the total data set (instead of pairing for each extract and library
build), using lowest taxonomic end nodes. Candidate detection was performed by
decreasing the detection threshold in ‘Holi’ from 0.1% to 0.01% to increase the
detection of contaminating plants; similarly, for metazoans we decreased the
detection level and subtracted all taxa with 2 or more reads (see Supplementary
Information). We performed a series of in silico tests to measure the sensitivity and
specificity of our assignment method and to estimate the likelihood of false positives
(see Supplementary Information).
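The detection thresholds described above can be sketched as a pair of filters. This is an illustrative reading, not the 'Holi' code itself: the function names are hypothetical, and the metazoan rule is implemented as "at least 3 reads in one sample, or at least 1 read in each of three samples from the same lake", which is one interpretation of the text.

```python
def passes_plant_threshold(taxon_reads, total_assigned_reads, min_fraction=0.001):
    """Keep a plant taxon only if it holds at least 0.1% of assigned reads."""
    if total_assigned_reads <= 0:
        raise ValueError("total assigned reads must be positive")
    return taxon_reads / total_assigned_reads >= min_fraction


def passes_metazoan_threshold(reads_per_sample, min_in_one=3, min_samples=3):
    """reads_per_sample: unique read counts for one taxon across one lake's samples."""
    enough_in_one_sample = any(n >= min_in_one for n in reads_per_sample)
    samples_with_reads = sum(1 for n in reads_per_sample if n >= 1)
    return enough_in_one_sample or samples_with_reads >= min_samples


# Hypothetical counts for illustration only.
print(passes_plant_threshold(20, 10_000))       # 0.2% of assigned reads
print(passes_metazoan_threshold([1, 1, 1, 0]))  # one read in each of three samples
print(passes_metazoan_threshold([2, 0, 1]))     # meets neither criterion
```

Lowering `min_fraction` to 0.0001 corresponds to the more permissive 0.01% setting used when screening the blanks for contaminating plants.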

We generated 1,030,354,587 Illumina reads distributed across 32 sediment
samples and used the dedicated computational pipeline (‘Holi’) for handling read
de-multiplexing, adaptor trimming, quality control, duplicate and low-complexity
read removal (see Supplementary Information). The 257,890,573 reads passing
filters were further aligned against the whole non-redundant nucleotide (nt)
sequence database. Thereafter, we used a lowest common ancestor approach
to recover taxonomic information from the 985,818 aligning reads. Plants
represented by less than 0.1% of the total reads assigned were discarded to limit
false positives resulting from database mis-annotations, PCR and sequencing errors
(see Supplementary Information). Given the low number of reads assigned to
multicellular, eukaryotic organisms (metazoans), we set a minimal threshold of
3 counts per sample or 1 count in each of three samples. For plants and metazoans
this resulted in 511,504 and 2,596 reads assigned at the family or genus levels,
respectively. The read counts were then normalized for generating plant and
metazoan taxonomic profiles (Extended Data Figs 5 and 6). Taxonomic profiles
for reads assigned to bacteria, archaea, fungi and alveolata were also produced (see
Supplementary Information).
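The idea behind a lowest common ancestor assignment can be illustrated with a toy taxonomy. This sketch is not MEGAN's implementation; the parent map, the taxon names chosen and the function names are hypothetical and serve only to show how a read matching several references is placed at the deepest node they share.

```python
def lineage(taxon, parent):
    """Path from a taxon up to the root, following the child -> parent map."""
    path = [taxon]
    while taxon in parent:
        taxon = parent[taxon]
        path.append(taxon)
    return path


def lowest_common_ancestor(taxa, parent):
    """Deepest node shared by the lineages of all matched taxa."""
    taxa = list(taxa)
    if not taxa:
        raise ValueError("at least one taxon is required")
    common = lineage(taxa[0], parent)  # ordered from leaf to root
    for taxon in taxa[1:]:
        other = set(lineage(taxon, parent))
        common = [node for node in common if node in other]
    return common[0]


# Toy taxonomy (child -> parent), for illustration only.
PARENT = {
    "Bison bison": "Bison", "Bison bonasus": "Bison", "Bison": "Bovidae",
    "Bos taurus": "Bos", "Bos": "Bovidae", "Bovidae": "root",
}
print(lowest_common_ancestor(["Bison bison", "Bison bonasus"], PARENT))
print(lowest_common_ancestor(["Bison bison", "Bos taurus"], PARENT))
```

A read that matches only references within one genus is assigned to that genus, whereas a read matching references in two genera is pushed up to the family.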
DNA damage and authenticity. We estimated the DNA damage levels using the 
MapDamage package 2.0 (ref. 40) for the most abundant organisms (Extended Data 
Fig. 7b). These represent distinctive sources, which help to account for potential 
differences between damage accumulated from source to deposition or during 
deposition. Input SAM files were generated for each sample using Bowtie2 (ref. 68) 
to align all QC reads from each sample against each reference genome. All aligning 
sequences were converted to BAM format, sorted and parsed through MapDamage 
by running the statistical estimation using only the 5′-ends (--forward) for single
reads. All frequencies of cytosine to thymine mutations per position from the 5’ 
ends were parsed and the standard deviation was calculated to generate DNA 
damage models for each lake (Extended Data Fig. 7a and Supplementary 
Information). 
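The per-lake summary described above (mean and standard deviation of first-position cytosine to thymine frequencies across reference taxa) can be sketched as follows. The function name, the input layout and the example values are hypothetical; the read-count cutoff of 500 follows the Extended Data Fig. 7 caption, and the minimum of two taxa is an assumption needed for a sample standard deviation.

```python
from statistics import mean, stdev


def lake_damage_summary(c_to_t_by_taxon, reads_by_taxon, min_reads=500, min_taxa=2):
    """Mean and s.d. of 5' first-position C->T frequencies for well-covered taxa."""
    kept = [freq for taxon, freq in c_to_t_by_taxon.items()
            if reads_by_taxon.get(taxon, 0) > min_reads]
    if len(kept) < min_taxa:
        return None  # too few taxa to estimate a spread
    return mean(kept), stdev(kept)


# Hypothetical frequencies and aligned-read counts for one sample.
frequencies = {"Populus": 0.10, "Picea": 0.20, "Bison": 0.30, "Anabaena": 0.90}
aligned_reads = {"Populus": 1000, "Picea": 1000, "Bison": 1000, "Anabaena": 100}
print(lake_damage_summary(frequencies, aligned_reads))
```

Taxa below the read cutoff are dropped before averaging, so a poorly covered reference cannot dominate the estimate.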
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Extended Data Figure 1 | Topographic transects. The red and white lines on Fig. 1b mark topographic transects of Charlie Lake and Spring Lake 
in relation to the four phases of Glacial Lake Peace. CIC, Cordilleran ice complex; m.a.s.l., metres above sea level.
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Extended Data Figure 2 | Visual and physical descriptions and age- 
depth model for the studied lake sediments. a, b, Charlie Lake (a) and 
Spring Lake (b) span the Pleistocene to Holocene transition (dotted 

grey line); magnetic susceptibility (continuous black line); and compressed 
high-resolution images from the ITRAX core scanner and the sedimentary 
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log are shown. Age-depth models for Charlie Lake (a) and Spring Lake 
(b) were generated with P_Sequence deposition models in OxCal v. 4.2 
using the IntCal13 radiocarbon calibration curve. The probability
envelopes represent the 68.2% and 95.4% confidence ranges, respectively 
(see Methods and Supplementary Information). 
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Extended Data Figure 3 | Charlie Lake pollen and macrofossil diagrams. a, Pollen are presented as influx and bullet points indicate taxa with less 
than 2 grains cm⁻² year⁻¹. The diagram was zoned using CONIIC with a stratigraphically constrained cluster analysis on the information statistic.
b, Relative proportions of ecologically important taxa. c, Macrofossils were identified but not enumerated. Bullet points represent presence. 
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Extended Data Figure 4 | Spring Lake pollen and macrofossil diagrams. a, Pollen are presented as influx and bullet points represent taxa with less 
than 50 grains cm⁻² year⁻¹. The diagram was zoned using CONIIC with a stratigraphically constrained cluster analysis on the information statistic.
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Extended Data Figure 5 | Charlie Lake DNA diagram. DNA results are presented as normalized counts to allow comparison on the temporal 
scale for each taxon. All are unique sequences with 100% sequence identity to taxa. Histogram width equals the accumulation period. a, Viridiplantae, 
bullet points represent counts less than 50. b, Algae, bullet points represent counts less than 50. c, Metazoans, bullet points represent counts equal to 1. 
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Extended Data Figure 6 | Spring Lake DNA diagram. DNA results are presented as normalized counts to allow comparison on the temporal scale for 
each taxon. All are unique sequences with 100% sequence identity to taxa. Histogram width equals the accumulation period. a, Viridiplantae, bullet
points represent counts less than 50. b, Algae, bullet points represent counts less than 50. c, Metazoans, bullet points represent counts equal to 1.
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b 
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Organism Spring Lake Charlie Lake NCBI GI number 
Tree Populus Populus 134093177 
Tree/shrub Picea Artemisia 502524829/733387608 
Aquatic Ceratophyllum Ceratophyllum 148508422 
Mammal Bison Bison 225622211 
Cyanobacteria : Anabaena 414075311 


Extended Data Figure 7 | DNA damage accumulation model. Maximum-likelihood DNA damage rates were estimated from nucleotide
misincorporation patterns using MapDamage 2.0 (ref. 40). a, Each full circle is the mean of cytosine to thymine mutation frequencies at the
first position (n > 2 species) with above 500 reads aligned to the reference; bars represent ±1 s.d. b, Table of species used for determining the DNA
damage rates.
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Extended Data Table 1 | AMS 14C determinations of terrestrial plant macrofossil samples from Charlie and Spring Lakes 


Columns: sample ID; AMS laboratory code; depth (cm); conventional ¹⁴C age (yrs BP ± 1σ unless stated; ref. 82); δ¹³C (‰); modelled, calibrated age (cal. BP), 68.2% probability range; modelled, calibrated age (cal. BP), 95.4% probability range; posterior / prior outlier probability (%).

Sample ID  AMS laboratory     Depth   Conventional ¹⁴C age   δ¹³C    68.2% range      95.4% range      Posterior / prior
           code               (cm)    (yrs BP ± 1σ)          (‰)     (cal. BP)        (cal. BP)        outlier (%)

Charlie Lake
MWP_30     OxA-31690          16      844 ± 24               -24.9   782 - 730        797 - 690        5/5
MWP_31     OxA-31691*         36      4447 ± 39              -26.0   -                -                (81/5)*
MWP_32     OxA-31692          56      3052 ± 26              -27.9   3335 - 3216      3353 - 3180      3/5
MWP_20     OxA-31359          80      5396 ± 34              -21.3   6276 - 6188      6289 - 6024      3/5
MWP_21     OxA-31360*         90      7900 ± 45              -24.0   -                -                (70/5)*
MWP_22     OxA-31361          98      7255 ± 40              -22.8   8156 - 8016      8171 - 7995      4/5
MWP_09     OxA-32206          114     9230 ± 40              -26.6   10480 - 10290    10506 - 10257    4/5
MWP_23     OxA-X-2623-15†*    149     1.03763 ± 0.00595      -25.6   -                -                (100/5)*
MWP_10     OxA-31358          157     10635 ± 50             -26.6   12668 - 12573    12715 - 12447    4/5

Spring Lake
MWP_11     OxA-X-2623-11†     70      3465 ± 65              -22.8   3829 - 3643      3901 - 3568      4/5
MWP_35     OxA-31898          180.5   7248 ± 35              -10.8   8158 - 8014      8170 - 7992      4/5
MWP_15     OxA-31587          233     10010 ± 45             -27.1   11604 - 11331    11707 - 11145    3/5
MWP_15     OxA-31588          233     10040 ± 45             -28.7   11604 - 11331    11707 - 11145    3/5
MWP_16     OxA-31589          250     10045 ± 45             -26.8   11701 - 11556    11746 - 11483    3/5
MWP_02     OxA-31586          262     10175 ± 45             -27.8   11735 - 11616    11825 - 11510    6/5
MWP_17     OxA-31590          273     10105 ± 45             -26.1   11762 - 11649    11905 - 11615    3/5
MWP_18     OxA-31355          278     10080 ± 55             -24.6   11791 - 11662    11925 - 11624    4/5
MWP_03     OxA-31354          288     10105 ± 50             -25.4   11836 - 11673    11978 - 11641    4/5

Data were calibrated with the IntCal13 calibration curve and modelled using the Bayesian statistical software OxCal v. 4.2 (refs. 60, 61, 73).
†Samples that represent ‘very small graphite’ AMS targets (<0.5 mg C), and so should be treated with caution (and hence the ‘OxA-X-’ laboratory code prefix).
*Three samples from Charlie Lake produced statistically outlying dates (>50% probability) that were excluded from the final age model.
The antibody aducanumab reduces Aβ
plaques in Alzheimer’s disease


Jeff Sevigny!*, Ping Chiao, Thierry Bussiére!*, Paul H. Weinreb!*, Leslie Williams!, Marcel Maier?, Robert Dunstan!, 
Stephen Salloway’, Tianle Chen!, Yan Ling!, John O’Gorman!, Fang Qian!, Mahin Arastu!, Mingwei Li!, Sowmya Chollate!, 
Melanie S. Brennan!, Omar Quintero-Monzon!, Robert H. Scannevin!, H. Moore Arnold!, Thomas Engber!, Kenneth Rhodes!, 
James Ferrero!, Yaming Hang!, Alvydas Mikulskis!, Jan Grimm?, Christoph Hock“, Roger M. Nitsch**s & Alfred Sandrock'! 


Alzheimer’s disease (AD) is characterized by deposition of amyloid-β (Aβ) plaques and neurofibrillary tangles in the brain,
accompanied by synaptic dysfunction and neurodegeneration. Antibody-based immunotherapy against Aβ to trigger
its clearance or mitigate its neurotoxicity has so far been unsuccessful. Here we report the generation of aducanumab,
a human monoclonal antibody that selectively targets aggregated Aβ. In a transgenic mouse model of AD, aducanumab
is shown to enter the brain, bind parenchymal Aβ, and reduce soluble and insoluble Aβ in a dose-dependent manner.
In patients with prodromal or mild AD, one year of monthly intravenous infusions of aducanumab reduces brain Aβ in a
dose- and time-dependent manner. This is accompanied by a slowing of clinical decline measured by Clinical Dementia
Rating–Sum of Boxes and Mini Mental State Examination scores. The main safety and tolerability findings are
amyloid-related imaging abnormalities. These results justify further development of aducanumab for the treatment of AD. Should
the slowing of clinical decline be confirmed in ongoing phase 3 clinical trials, it would provide compelling support for
the amyloid hypothesis.


The amyloid hypothesis posits that Aβ-related toxicity is the primary
cause of synaptic dysfunction and subsequent neurodegeneration that
underlies the progression characteristic of AD. Genetic, neuropathological,
and cell biological evidence strongly suggests that targeting Aβ
could be beneficial for patients with AD. So far, attempts at therapeutically
targeting Aβ have not been successful, casting doubt on the
validity of the amyloid hypothesis. However, the lack of success may
have been due to the inability of the antibodies to adequately engage
their target or the proper target in the brain, or to the selection of the
wrong patient population.

We describe the development of an antibody-based immunotherapeutic
approach by selecting human B-cell clones triggered by
neo-epitopes present in pathological Aβ aggregates. The screening of
libraries of human memory B cells for reactivity against aggregated Aβ
led to molecular cloning, sequencing, and recombinant expression of
aducanumab (BIIB037), a human monoclonal antibody that selectively
reacts with Aβ aggregates, including soluble oligomers and insoluble
fibrils. In preclinical studies, we show that an analogue of aducanumab
is capable of crossing the blood-brain barrier, engaging its target, and
clearing Aβ from plaque-bearing transgenic mouse brains. These
results prompted the start of clinical trials.

We report here interim results from a double-blind, placebo-controlled
phase 1b randomized trial (PRIME; ClinicalTrials.gov identifier
NCT01677572) designed to investigate the safety, tolerability,
pharmacokinetics, and pharmacodynamics of monthly infusions
of aducanumab in patients with prodromal or mild AD with brain
Aβ pathology confirmed by molecular positron emission tomography
(PET) imaging. Together, our data support further development
of aducanumab as an Aβ-removing, disease-modifying therapy
for AD.


Removal of brain Aβ plaques in patients with AD

In the PRIME study, 165 patients were randomized and treated between
October 2012 and January 2014 at 33 sites in the United States. Patients
with a clinical diagnosis of prodromal or mild AD and visually positive
Aβ PET scan were given monthly intravenous infusions of placebo
or aducanumab at doses of 1, 3, 6 or 10 mg kg⁻¹ for one year. Of these
patients, 125 completed and 40 discontinued treatment, most commonly
due to adverse events (20 patients) and withdrawal of consent
(14 patients): 25% of the placebo group discontinued compared with
23%, 19%, 17%, and 38% of the 1, 3, 6 and 10 mg kg⁻¹ aducanumab dose
groups, respectively (Extended Data Fig. 1). Baseline characteristics,
including cognitive measures, were generally well-balanced across the
groups, although the 1 mg kg⁻¹ dose group included a higher proportion
of patients with mild AD, and the aducanumab treatment groups tended
to have a higher Clinical Dementia Rating–Sum of Boxes (CDR-SB)
score (Table 1).

Treatment with aducanumab reduced brain Aβ plaques as measured
by florbetapir PET imaging in a dose- and time-dependent fashion
(Figs 1, 2a). The mean PET standard uptake value ratio (SUVR) composite
score at baseline was 1.44. After 54 weeks of treatment, this had
decreased significantly (P < 0.001) in the 3, 6 and 10 mg kg⁻¹ dose
groups, whereas change for the placebo group was minimal (Fig. 2a,
Extended Data Table 1). In the 10 mg kg⁻¹ dose group, the SUVR
composite score was 1.16 after 54 weeks of treatment, a value near the
purported quantitative cut-point of 1.10 that discriminates between
positive and negative scans (Fig. 2b). The adjusted mean changes
in SUVR composite scores in the 6 and 10 mg kg⁻¹ groups treated for
26 weeks were similar in magnitude to those in the next-lower dose groups (3 and
6 mg kg⁻¹, respectively) treated for 54 weeks (Fig. 2a). Reductions in
amyloid PET SUVR composite score in aducanumab-treated patients
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Table 1 | Baseline characteristics 


Characteristic                                Placebo      Aducanumab   Aducanumab   Aducanumab   Aducanumab   Total
                                                           1 mg kg⁻¹    3 mg kg⁻¹    6 mg kg⁻¹    10 mg kg⁻¹
                                              (n = 40)     (n = 31)     (n = 32)     (n = 30)     (n = 32)     (n = 165)*

Years of age (mean ± s.d.)                    72.8 ± 7.2   72.6 ± 7.8   70.5 ± 8.2   73.3 ± 9.3   73.7 ± 8.3   72.6 ± 8.1
Female sex (n (%))                            23 (58)      13 (42)      17 (53)      15 (50)      15 (47)      83 (50)
ApoE ε4 (n (%)): carriers                     26 (65)      19 (61)      21 (66)      21 (70)      20 (63)      107 (65)
ApoE ε4 (n (%)): non-carriers                 14 (35)      12 (39)      11 (34)      9 (30)       12 (38)      58 (35)
Clinical stage (n (%)): prodromal             19 (48)      10 (32)      14 (44)      12 (40)      13 (41)      68 (41)
Clinical stage (n (%)): mild                  21 (53)      21 (68)      18 (56)      18 (60)      19 (59)      97 (59)
MMSE (mean ± s.d.)                            24.7 ± 3.6   23.6 ± 3.3   23.2 ± 4.2   24.4 ± 2.9   24.8 ± 3.1   24.2 ± 3.5
Global CDR (n (%)): 0.5                       34 (85)      22 (71)      22 (69)      25 (83)      24 (75)      127 (77)
Global CDR (n (%)): 1                         6 (15)       9 (29)       10 (31)      5 (17)       8 (25)       38 (23)
CDR-SB (mean ± s.d.)                          2.66 ± 1.50  3.40 ± 1.76  3.50 ± 2.06  3.32 ± 1.54  3.14 ± 1.71  3.18 ± 1.72
FCSRT sum of free recall score (mean ± s.d.)  15.2 ± 8.5   13.2 ± 9.0   13.8 ± 8.0   14.4 ± 8.3   14.6 ± 8.3   14.3 ± 8.3
PET SUVR composite score (mean ± s.d.)        1.44 ± 0.17  1.44 ± 0.15  1.46 ± 0.15  1.43 ± 0.20  1.44 ± 0.19  1.44 ± 0.17
AD medications use† (n (%))                   24 (60)      19 (61)      28 (88)      20 (67)      17 (53)      108 (65)

Percentages are rounded to the nearest integer. AD, Alzheimer’s disease; ApoE ε4, apolipoprotein E ε4 allele; CDR, Clinical Dementia Rating; CDR-SB, Clinical Dementia Rating–Sum of Boxes; FCSRT,
Free and Cued Selective Reminding Test; MMSE, Mini-Mental State Examination; PET, positron emission tomography; s.d., standard deviation; SUVR, standard uptake value ratio.

*Number of patients dosed.
†Cholinesterase inhibitors and/or memantine.


were similar in patients with mild and prodromal AD, and apolipoprotein
E (ApoE) ε4 carriers and non-carriers (Extended Data Fig. 2a, b).
Pre-specified regional analyses of SUVR changes demonstrated statistically
significant dose-dependent reductions in all brain regions,
except for the pons and sub-cortical white matter, two areas in which
Aβ plaques are not expected to accumulate (Extended Data Fig. 3).


Effect on clinical measures 

Clinical assessments were exploratory as the study was not powered to
detect clinical change. The test of dose response was the pre-specified
primary analysis for the clinical assessments. Analysis of change from
baseline on the CDR-SB (adjusted for baseline CDR-SB and ApoE ε4
status) demonstrated dose-dependent slowing of clinical progression
with aducanumab treatment at one year (dose-response, P < 0.05), with
the greatest slowing for 10 mg kg⁻¹ (P < 0.05 versus placebo) (Fig. 3a,
Extended Data Table 1). Sensitivity analysis using a mixed model for
repeated measures (MMRM) also showed a trend for slowing of decline
on the CDR-SB at one year (P = 0.07 with 10 mg kg⁻¹ aducanumab
versus placebo). A dose-dependent slowing of clinical progression
on the Mini Mental State Examination (MMSE) with aducanumab
treatment was also observed at one year (dose-response, P < 0.05),
with the greatest effects at 3 and 10 mg kg⁻¹ aducanumab (P < 0.05
versus placebo) (Fig. 3b, Extended Data Table 1). On sensitivity analysis
using MMRM, the greatest difference was retained for 10 mg kg⁻¹
aducanumab (P < 0.05 versus placebo), with a smaller difference at
3 mg kg⁻¹ (P = 0.10 versus placebo). No changes from baseline after
one year were found on the composite neuropsychological test battery
(NTB) or the Free and Cued Selective Reminding Test (FCSRT) free
recall (Extended Data Table 1), but skewed non-normal (floor) effects
at baseline were observed. The floor effects on the NTB were seen in
the individual tests; specifically, in the two most clinically relevant
components given the stage of the population enrolled: Wechsler Memory
Scale-Fourth Edition Verbal Paired Associates II (WMS-IV VPA II)
and Rey Auditory Verbal Learning Test (RAVLT) delayed recall of the
NTB memory domain.


Safety and tolerability 

The most common adverse effects were amyloid-related imaging
abnormalities (ARIA), headache, urinary tract infection, and upper
respiratory tract infection (Table 2). Using the most specific description
of ARIA by magnetic resonance imaging (MRI), ARIA-vasogenic
oedema (ARIA-E) abnormalities occurred in no patients receiving
placebo compared with 1 (3%), 2 (6%), 11 (37%), and 13 (41%) patients
receiving 1, 3, 6 and 10 mg kg⁻¹ aducanumab, respectively (Extended
Data Table 2). ARIA-E was generally observed early in the course of
treatment, MRI findings typically resolved within 4-12 weeks, and of
the 27 patients who developed ARIA-E, 15 (56%) continued treatment
(Supplementary Information). All cases of symptomatic ARIA were


Figure 1 | Amyloid plaque reduction with aducanumab: example
amyloid PET images at baseline and week 54. Individuals were chosen
based on visual impression and SUVR change relative to average one-year
response for each treatment group (n = 40, 32, 30 and 32, respectively).
Axial slice shows anatomical regions in posterior brain putatively related
to AD pathology. SUVR, standard uptake value ratio.


1 SEPTEMBER 2016 | VOL 537 | NATURE | 51 


© 2016 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 




[Figure 2 image. a, Change from baseline in amyloid PET SUVR at weeks 26 and 54 for placebo and aducanumab 1, 3, 6 and 10 mg kg⁻¹; dose-response P < 0.001 at weeks 26 and 54 based on a linear contrast test. b, SUVR values at baseline, week 26 and week 54 by treatment group. c, CDR-SB (baseline, weeks 26 and 54) and MMSE (baseline, weeks 24 and 52) by category of change in amyloid PET.]

Figure 2 | Amyloid plaque reduction with aducanumab. a-c, Change 
from baseline (a, analyses using ANCOVA), SUVR values (b), and 
categorization of change in amyloid PET (c) at week 54 and associated 
change from baseline CDR-SB and MMSE in aducanumab-treated patients 
(post hoc analysis). Categorization of amyloid PET at week 54 based on 
s.d. of change from baseline in placebo-treated patients. **P < 0.01;
***P < 0.001 versus placebo; two-sided tests with no adjustments for
multiple comparisons. Mean ± s.e. ANCOVA, analysis of covariance;
CDR-SB, Clinical Dementia Rating-Sum of Boxes; MMSE, Mini Mental
State Examination; SUVR, standard uptake value ratio.


required to be reported as medically important serious adverse effects. 
No patients were hospitalised for ARIA. The only serious adverse effects 
(by preferred term) that occurred in more than one patient in any 
treatment group were ARIA (0, 1 (3%), 1 (3%), 4 (13%), and 5 (16%) 
of patients receiving placebo, and 1, 3, 6 and 10 mg kg⁻¹ aducanumab,
respectively) and superficial siderosis of the central nervous system
(0, 1 (3%), 0, 2 (7%), and 3 (9%) of patients receiving placebo and 1,
3, 6 and 10 mg kg⁻¹ aducanumab, respectively). Owing to the requirement
for repeated MRI assessments of those patients who developed
ARIA, these individuals were partially unblinded to treatment. Other 
adverse effects and serious adverse effects were consistent with the 
patient population. There were no drug-related deaths (Supplementary 
Information). 
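
The ARIA-E percentages quoted above can be re-derived from the case counts in the text and the per-arm safety-population sizes in Table 2. The following is a minimal sketch of that arithmetic; the dictionary names and the round-half-up helper are illustrative assumptions, not part of the study's analysis code.

```python
# Safety-population size per arm, taken from the Table 2 column headers.
GROUP_SIZES = {"placebo": 40, "1 mg/kg": 31, "3 mg/kg": 32, "6 mg/kg": 30, "10 mg/kg": 32}

# ARIA-E cases per arm, as reported in the Safety and tolerability text.
ARIA_E_CASES = {"placebo": 0, "1 mg/kg": 1, "3 mg/kg": 2, "6 mg/kg": 11, "10 mg/kg": 13}


def incidence_percent(cases: int, n: int) -> int:
    """Incidence as a whole-number percentage (round half up; an assumed convention)."""
    return int(100 * cases / n + 0.5)


# Percentage of each arm with ARIA-E.
aria_e_percent = {
    arm: incidence_percent(ARIA_E_CASES[arm], GROUP_SIZES[arm]) for arm in GROUP_SIZES
}

# Total ARIA-E cases, and the share of those patients who continued treatment (15 per the text).
total_aria_e = sum(ARIA_E_CASES.values())
continued_percent = incidence_percent(15, total_aria_e)
```

Under these assumptions the counts sum to the 27 patients cited in the text, and the rounded percentages agree with the reported values.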


Pharmacokinetics 

The pharmacokinetics of aducanumab (maximum concentration 
(Cmax) and cumulative area under the concentration curve (AUC)) were 
linear across the dose range in patients who received all 14 planned 
doses (Extended Data Table 3). The median plasma half-life was 
21 days. In total, 3 of 118 evaluable patients (3%) in the combined 
aducanumab groups developed treatment-emergent anti-aducanumab 
antibodies within the first year of treatment. Antibody responses were 




[Figure 3 image. a, CDR-SB: adjusted mean change from baseline (± s.e.) at weeks 26 and 54 for placebo and aducanumab 1, 3, 6 and 10 mg kg⁻¹; dose-response P < 0.05 at week 54 based on a linear contrast test. b, MMSE: adjusted mean change from baseline (± s.e.) at weeks 24 and 52 for the same groups.]

Figure 3 | Aducanumab effect (change from baseline) on CDR-SB
and MMSE. a, b, Aducanumab effect on CDR-SB (a) and MMSE (b).
*P < 0.05 versus placebo; two-sided tests with no adjustments for multiple
comparisons. CDR-SB and MMSE were exploratory endpoints. Adjusted
mean ± s.e. Analyses using ANCOVA. CDR-SB, Clinical Dementia
Rating-Sum of Boxes; MMSE, Mini Mental State Examination.


transient, with minimal titres, and had no apparent effect on adu- 
canumab pharmacokinetics or safety. 


Brain penetration and binding to Aβ plaques
In the preclinical studies which preceded PRIME, systemically administered
aducanumab (single dose, 30 mg kg⁻¹ intraperitoneally (i.p.))
bound to diffuse and compact Aβ plaques in the brains of 22-month-old
female Tg2576 transgenic mice ('Target engagement study'; Extended
Data Fig. 4a-d). Cmax in plasma was 181 µg ml⁻¹, with a terminal
elimination half-life (t1/2) of 2.5 days. The Cmax in brain was 1,062 ng g⁻¹
of tissue, and approximately 400-500 ng g⁻¹ of drug was measured
3 weeks after dosing, suggesting long-term retention. Consequently, the
brain:plasma AUC ratio of 1.3% was higher than the 0.1% frequently
reported for systemically administered antibodies.

Administration of a single dose of aducanumab did not affect plasma
(Extended Data Fig. 4b) or brain (data not shown) Aβ concentrations,
consistent with the observation that aducanumab does not bind to
soluble Aβ monomers. In contrast, the murine bapineuzumab precursor
antibody 3D6, which binds to Aβ monomers, triggered a transient
plasma Aβ spike (Extended Data Fig. 4b). Similarly, plasma Aβ concentrations
were stable after repeated dosing with aducanumab in the
PRIME study (data not shown). Within 24 h of dosing, aducanumab
bound to parenchymal brain Aβ with a spatial pattern essentially
superimposable with ex vivo pan-Aβ antibody staining, confirming
that aducanumab binds all morphological types of brain Aβ plaques in
vivo, including diffuse Aβ deposits and compact Aβ plaques (Extended
Data Fig. 4c, d). Aducanumab binding to Aβ deposited in cerebral amyloid
angiopathy (CAA) lesions within brain blood vessel walls was less
Table 2 | Summary of adverse events and most common adverse events

                                                               Aducanumab
Adverse event (n (%))                             Placebo      1 mg kg⁻¹    3 mg kg⁻¹    6 mg kg⁻¹    10 mg kg⁻¹
                                                  (n = 40)     (n = 31)     (n = 32)     (n = 30)     (n = 32)
Any adverse event                                 39 (98)      28 (90)      27 (84)      28 (93)      29 (91)
Serious event                                     15 (38)      3 (10)       4 (13)       4 (13)       12 (38)
Discontinuing treatment due to an adverse event   4 (10)       3 (10)       2 (6)        3 (10)       10 (31)
Common adverse events
ARIA                                              2 (5)        2 (6)        4 (13)       11 (37)      15 (47)
Headache                                          2 (5)        5 (16)       4 (13)       8 (27)       8 (25)
Urinary tract infection                           4 (10)       3 (10)       2 (6)        4 (13)       5 (16)
Upper respiratory tract infection                 6 (15)       2 (6)        2 (6)        2 (7)        6 (19)
Diarrhoea                                         3 (8)        0            6 (19)       1 (3)        3 (9)
Arthralgia                                        2 (5)        (e)          6 (19)       2 (7)        1 (3)
Fall                                              8 (20)       3 (10)       2 (6)        2 (7)        2 (6)
Superficial siderosis of CNS                      0            2 (6)        1 (3)        2 (7)        4 (13)
Constipation                                      0            3 (10)       1 (3)        1 (3)        3 (9)
Nausea                                            2 (5)        2 (6)        5 (16)       0            1 (3)
Anxiety                                           4 (10)       4 (13)       1)           1 (3)        1 (3)
Nasopharyngitis                                   0            1 (3)        5 (16)       0            1 (3)
Cough                                             2 (5)        3 (10)       1 (3)        6)           1 (3)
Alanine aminotransferase increased                0            3 (10)       6)           1 (3)        0)
Aspartate aminotransferase increased              0            3 (10)       6)           0            1 (3)


Common adverse events are those with an incidence of >10% in any aducanumab treatment group. Incidence of ARIA based on adverse event reporting. Adverse events of ARIA-E (oedema) and 
ARIA-H (micro-haemorrhage) are both coded to the MedDRA preferred term of amyloid-related imaging abnormalities, and ARIA-H (superficial siderosis) codes to the MedDRA preferred term of 
superficial siderosis of the CNS. ARIA, amyloid-related imaging abnormalities; CNS, central nervous system; MedDRA, Medical Dictionary for Regulatory Activities. 


prominent than parenchymal Aβ binding, when compared with the
total amount of Aβ (Extended Data Fig. 4c, d).


Reduction of brain Aβ in transgenic mice

Exposure in plasma and brain correlated linearly with dose after chronic
dosing in plaque-bearing transgenic mice (Extended Data Fig. 5)
(Supplementary Information). chaducanumab, a murine IgG2a/κ
chimaeric analogue, dose-dependently reduced Aβ measured in brain
homogenates by up to 50% relative to the vehicle control in the diethylamine
(DEA) fraction that extracted soluble monomeric and oligomeric
forms of Aβ40 and Aβ42 and in the guanidine hydrochloride
(GuHCl) fraction that extracted insoluble Aβ fibrils (Fig. 4a, b).

Quantitative 6E10 immunohistochemistry showed significant reductions
in all forms of Aβ deposits by up to 70% (Fig. 4c, d). Thioflavin S
(ThioS) staining of compact Aβ plaques showed dose-dependent and
statistically significant reductions in the cortex and hippocampus
by up to 63% (Fig. 4c, d). Quantitative histology indicated that
chaducanumab significantly reduced the number of plaques of all sizes,
including plaques >500 µm² and plaques <125 µm² (Extended Data
Fig. 6a-c). Quantification of ThioS-positive vascular and parenchymal
Aβ plaques separately showed that aducanumab did not affect vascular
Aβ in either cortex or hippocampus (Fig. 4e-h).

To identify the mechanism of Aβ clearance, we analysed the involvement
of microglia which are known to display enhanced phagocytic
activities through binding to the Fc region of an antibody.
chaducanumab significantly increased recruitment of Iba-1-positive
microglia to Aβ plaques, suggesting FcγR-mediated phagocytosis of
antibody-Aβ complexes as a possible clearance mechanism (Extended
Data Fig. 7a-c and Supplementary Information).


Biochemical characterization 

The apparent affinities of aducanumab and chaducanumab for aggregated
Aβ42, with half maximal effective concentration (EC50) values of
0.1 nM, were comparable to 3D6 (ref. 13) (Fig. 5a). Neither aducanumab
nor chaducanumab bound monomeric soluble Aβ40 at concentrations
up to 1 µM, indicating >10,000-fold selectivity for aggregated Aβ over
monomer, whereas 3D6 bound soluble Aβ40 with an EC50 of 1 nM
(Fig. 5b). In contrast to 3D6, which immunoprecipitated both monomeric
and aggregated Aβ, chaducanumab bound soluble Aβ42 oligomers
and insoluble Aβ42 fibrils prepared in vitro, but not Aβ42 monomers
(Fig. 5c). Histological staining of autopsy tissue from patients with AD
or aged amyloid precursor protein (APP) transgenic mice confirmed
binding of aducanumab to bona fide human Aβ fibrils (Fig. 5d, e).


Discussion 

The PRIME study shows that aducanumab penetrates the brain and
decreases Aβ in patients with AD in a time- and dose-dependent
manner. Within 54 weeks of treatment, 3, 6 and 10 mg kg⁻¹ doses of
aducanumab significantly decreased the amyloid PET SUVR. Patients
receiving placebo showed virtually no change in their mean PET
SUVR composite scores over one year, indicating that Aβ pathology
had already reached an asymptote of accumulation. Considering that it
may have taken up to 20 years for Aβ to have accumulated to the levels
in these patients at study entry, the observed kinetics of Aβ removal
within a 12-month time period appears encouraging for a disease-modifying
treatment for patients with AD.

The cognitive results for CDR-SB and MMSE provide support for the
clinical hypothesis that reduction of brain Aβ confers a clinical benefit.
Post hoc analysis showed that those aducanumab-treated patients
who had decreased SUVR scores >1 standard deviation unit relative
to placebo-treated patients after one year of treatment experienced a
stabilization of clinical decline on both CDR-SB and MMSE scores;
whereas, those patients with a smaller or no decrease experienced clinical
decline similar to placebo patients (Fig. 2c). The apparent clinical
benefit observed in PRIME could also be explained by the binding of
aducanumab to oligomeric forms of Aβ, which would not be directly
detected by PET imaging. The reductions in SUVR scores may be surrogates
for reductions in toxic soluble Aβ oligomers which may have
had a more functionally relevant impact on cognition. Whereas significant
Aβ reduction was detectable by 6 months, clinical effects were not
Figure 4 | Reduction of amyloid burden following weekly dosing with
chaducanumab in 9.5- to 15.5-month-old Tg2576 transgenic mice.
a, b, Aβ40 and Aβ42 levels in soluble DEA (a) and insoluble GuHCl (b)
brain fractions. c, d, Total brain Aβ (6E10) and compact amyloid plaques
(ThioS) in cortex (c) and hippocampus (d) (mean ± s.e.; n = 20-55;
dotted line 50% reduction; *P < 0.05 versus control). e-h, ThioS staining
of amyloid deposits (e) and Visiopharm software (f) differentiated
parenchymal deposits (green) from vascular deposits (red) (representative
pictures 10× magnification), and quantified area of vascular amyloid
(g, h; mean ± s.e.; n = 20-24).


apparent until one year. Given that clearance of Aβ could be followed by
recovery of neuronal function, a lag between reduction of Aβ burden
and slowing of disease progression is not altogether surprising.

The main safety finding, ARIA-E, was dose-dependent and more
common in ApoE ε4 carriers, consistent with findings with other
anti-Aβ monoclonal antibodies. Although the underlying cause
of ARIA is not well understood, it is likely that the MRI signal of ARIA
is due to increased extracellular fluid. This may be a result of underlying
CAA, changes in perivascular clearance and vascular integrity, or local
inflammatory processes associated with Aβ-targeting therapies (see
Supplementary Information for further discussion).

Study limitations of the PRIME phase 1b study included staggered 
parallel-group design, small sample sizes, limited region (USA only), 
and possible partial unblinding due to ARIA-E. Measures were taken 
to maintain blinding to adverse effects: raters of given tests were not 
permitted to perform other clinical assessments, and were blinded to 
other assessments (for example, MMSE and CDR raters were required 
Figure 5 | Aducanumab binds selectively to insoluble fibrillar and
soluble oligomeric Aβ aggregates. a, Binding of chaducanumab or 3D6 to
immobilized fibrillar Aβ42. Mean ± s.d., in triplicate. b, Capture of soluble
monomeric Aβ40 with immobilized chaducanumab or 3D6. Mean ± s.d., in
triplicate. c, Dot blots of Aβ42 monomer, soluble oligomers, or insoluble
fibrils immunoprecipitated with chaducanumab, 3D6, or irrelevant
antibody control. Equivalent concentrations confirmed by direct dot
blotting (Peptide). d, e, Immunostaining of Aβ in autopsy brain tissue from
a patient with AD with chaducanumab (0.2 µg ml⁻¹) (d) and 22-month-old
Tg2576 transgenic mouse brain tissue with aducanumab (60 ng ml⁻¹) (e).


to be different and neither were permitted to perform other study 
assessments). Post hoc analyses of change from baseline PET SUVR 
composite score and cognition by presence/absence of ARIA suggested 
no apparent difference in treatment effect when comparing patients 
with and without ARIA-E (Extended Data Table 4). There was overlap 
in enrolment in Arms 1-3 (aducanumab 1 and 3 mg kg⁻¹, placebo) and
Arms 4 and 5 (aducanumab 10 mg kg⁻¹, placebo) but Arms 6 and 7
(aducanumab 6 mg kg⁻¹, placebo) were initiated after enrolment in
Arms 1-5 was complete. This was a small study designed for assessment
of safety and tolerability, and for detecting a pharmacological effect on
brain Aβ levels measured by PET imaging. The trial was not powered
for the exploratory clinical endpoints, thus the clinical cognitive results 
should be interpreted with caution. Primary analyses were based on 
observed data with no imputation for missing values, nominal P values 
were presented with no adjustments for multiple comparisons, and they 
were supported by sensitivity analyses using a MMRM. 

The initiation of the PRIME study and its results are supported
by extensive preclinical data. Detection on parenchymal Aβ plaques
following a single systemic administration confirmed that aducanumab
penetrates the brain to a sufficient extent to allow accumulation on
Aβ plaques. This is consistent with earlier findings showing that, in
the presence of significant Aβ deposition, plaque-binding antibodies
can be detected bound to the target over an extended period.
The minimal effective dose upon repeated systemic administration
of aducanumab in transgenic mice was 3 mg kg⁻¹ (corresponding
to minimally effective concentrations of 13.8 ± 1.9 µg ml⁻¹ in plasma
and 99.8 ± 30.0 ng g⁻¹ in brain) with reductions of Aβ42 in soluble and
insoluble brain fractions of approximately 50%, and reductions in Aβ
plaque of approximately 40%. Since exposure at 3 mg kg⁻¹ in animals
and humans is approximately equivalent, the observed dose-response
in the model was consistent with the clinical doses that led to reductions
in amyloid PET SUVR. chaducanumab cleared plaques of all sizes, suggesting
that aducanumab triggered clearance of pre-existing Aβ plaques
and prevented formation of new plaques.

In transgenic mice, aducanumab preferentially bound to parenchymal
Aβ over vascular Aβ deposits, consistent with the lack of effect
on vascular Aβ following chronic dosing. The effect of anti-Aβ antibody
therapies on the vascular Aβ compartment could be related to
micro-haemorrhages or oedema in transgenic mice, and may relate
to ARIA in clinical trials. Nevertheless, the preferential binding of
aducanumab to parenchymal versus vascular Aβ may have been critical
in allowing the use of relatively high doses in the clinical study so as
to achieve robust target engagement in the brains of patients with AD.

Several mechanisms may be involved in aducanumab’s Aβ-lowering
activity. The clearance of Aβ deposits was accompanied by enhanced
recruitment of microglia. Together with the reduced potency of
the aglycosylated form of chaducanumab (data not shown), and the
ex vivo phagocytosis data, this suggests that FcγR-mediated microglial
recruitment and phagocytosis played an important role in Aβ clearance
in these models. Activated microglia appeared to encapsulate the
remaining central dense core of plaques in treated animals, possibly
isolating them from the surrounding neuropil. It is commonly thought
that soluble Aβ oligomers, rather than monomers or plaques, may be
the primary toxic species. Considering that Aβ plaques might
be a source of Aβ oligomers, this suggests that treatment with
aducanumab might slow their release into the neuropil, thereby limiting
their toxic effect on neurons. In fact, chronic dosing of 18-month-old
Tg2576 transgenic mice with chaducanumab led to normalization
of neuritic calcium overload in the brain. Other studies have linked
calcium dyshomeostasis in neurons and microglia to binding of Aβ oligomers
to metabotropic receptors. Aducanumab binding to soluble
Aβ oligomers may prevent their interaction with those receptors,
thereby preventing the detrimental effect of membrane depolarization.
Restoration of this functional endpoint suggests that aducanumab
treatment may lead to beneficial effects on neuronal network function
underlying cognitive deficits.

Together, the clinical and preclinical data support continued devel- 
opment of aducanumab as a disease-modifying treatment for AD. The 
clinical study results provide robust support to the biological hypothesis 
that treatment with aducanumab reduces brain Aβ plaques and, more
importantly, to the clinical hypothesis that Aβ plaque reduction confers
clinical benefit. This concurs with preclinical data demonstrating brain
penetration, target engagement, and dose-dependent clearance of Aβ
plaques in transgenic mice. The clinical effects of aducanumab need
to be confirmed in larger studies. Both the long-term extension (LTE) 
phase of this study and phase 3 development are ongoing. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS


Clinical study subjects. Patients were screened for inclusion in three stages. 
First, patients were evaluated on demographic, and clinical and laboratory 
criteria, including being between 50-90 years of age, and meeting clinical criteria 
for either prodromal or mild AD, as determined by the investigator. The criteria 
for prodromal AD were: MMSE score between 24-30 (inclusive), a spontaneous 
memory complaint, objective memory loss defined as a free recall score of <27 on 
the FCSRT, a global CDR score of 0.5, absence of significant levels of impairment
in other cognitive domains and essentially preserved activities of daily living, and
an absence of dementia. The criteria for mild AD were: MMSE score between
20-26 (inclusive), a global CDR of 0.5 or 1.0, and meeting the National Institute
on Aging-Alzheimer’s Association core clinical criteria for probable AD. Second,
patients who remained eligible underwent MRI to exclude those with confounding 
pathology, including acute or sub-acute micro- or macro-haemorrhage, prior macro- 
haemorrhage, >4 micro-haemorrhages, superficial siderosis or any finding that 
might be a contributing cause of the patient’s dementia, pose a risk to the patient, 
or prevent a satisfactory MRI assessment for safety monitoring. Third, remaining 
eligible patients underwent a florbetapir PET scan, and those with a positive scan 
based on a visual assessment, as determined by a qualified reader, were eligible. 
The Aβ PET screening process has been described in a separate publication. Stable
use of most concomitant background medications was permitted and, in the case 
of cholinesterase inhibitors and/or memantine, patients were required to be on a 
stable dose for a minimum of 4 weeks before screening with no adjustment of dos- 
ing during the double-blind phase of the study. Patients were excluded if they had 
a medical condition that might be a contributing cause of cognitive impairment. 
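
The first screening stage described above amounts to a small set of threshold checks. A minimal sketch follows, assuming the thresholds exactly as printed (including the FCSRT free recall cut-off of <27); the `ClinicalScreen` fields and the `clinical_stage` function are hypothetical names, and the MRI and PET stages are not modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClinicalScreen:
    """Hypothetical record of the stage-one screening variables named in the text."""
    age: int
    mmse: int
    global_cdr: float
    fcsrt_free_recall: Optional[int] = None
    memory_complaint: bool = False
    other_domains_and_adl_preserved: bool = False
    has_dementia: bool = False
    meets_probable_ad_criteria: bool = False  # NIA-AA core clinical criteria


def clinical_stage(s: ClinicalScreen) -> Optional[str]:
    """Return 'prodromal', 'mild', or None if the stage-one criteria are not met."""
    if not 50 <= s.age <= 90:
        return None
    prodromal = (
        24 <= s.mmse <= 30
        and s.memory_complaint
        and s.fcsrt_free_recall is not None
        and s.fcsrt_free_recall < 27
        and s.global_cdr == 0.5
        and s.other_domains_and_adl_preserved
        and not s.has_dementia
    )
    if prodromal:
        return "prodromal"
    mild = (
        20 <= s.mmse <= 26
        and s.global_cdr in (0.5, 1.0)
        and s.meets_probable_ad_criteria
    )
    return "mild" if mild else None
```

In the study the stage was determined by the investigator; the sketch only makes the published numeric thresholds explicit.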
Clinical study design. This was a multicentre, randomized, 12-month, double-blind, 
placebo-controlled, multiple-dose study of aducanumab followed by a 42-month, 
dose-blinded LTE study in patients with either prodromal or mild AD who were 
Aβ PET-positive (ClinicalTrials.gov identifier NCT01677572). The primary objective
was to evaluate the safety and tolerability of multiple doses of aducanumab in
patients with prodromal AD or mild AD dementia. The secondary objectives were
to: (i) assess the effect on cerebral Aβ plaque content as measured by ¹⁸F-florbetapir
PET imaging at week 26; (ii) assess the multiple-dose serum concentrations of
aducanumab; and (iii) evaluate the immunogenicity of aducanumab after multiple-dose
administration. The key exploratory objectives were assessments of the effect
of aducanumab on the following: the clinical progression of AD as measured by
change from baseline on the CDR-SB, a NTB, and the FCSRT; disease-related
biomarkers in blood; cerebral Aβ plaque content as measured by ¹⁸F-florbetapir
PET imaging at week 54; and cerebral Aβ plaque content by ApoE ε4 carrier status
(carrier/non-carrier). Other exploratory endpoints were change from baseline on the 
Neuropsychiatric Inventory Questionnaire, Cognitive Drug Research computerized 
test battery, volumetric MRI, and, in a subset of patients, glucose metabolism as 
measured by fluorodeoxyglucose PET, functional connectivity by task-free func- 
tional MRI, cerebral blood flow by arterial spin labelling MRI, and disease-related 
biomarkers in cerebrospinal fluid. MMSE was another exploratory assessment. 

During the 12-month, double-blind, placebo-controlled phase, patients received
aducanumab or placebo by IV infusion once every 4 weeks for 52 weeks. In a
staggered, parallel-group design, the treatment arms were enrolled as follows:
first Arms 1-3 (aducanumab 1 mg kg⁻¹ (n = 30); aducanumab 3 mg kg⁻¹ (n = 30);
placebo (n = 20), respectively) in parallel. Once enrolment was open, Arms 4 and 5
(aducanumab up to 10 mg kg⁻¹ (n = 30) (actual dose 10 mg kg⁻¹); placebo (n = 10),
respectively) were enrolled in parallel with Arms 1-3. Once enrolment in Arms
1-5 was complete, enrolment in Arms 6 and 7 (aducanumab up to 30 mg kg⁻¹
(n = 30) (actual dose 6 mg kg⁻¹); placebo (n = 10), respectively) began. The trial
was initially designed to dose up to 30 mg kg⁻¹, but when ARIA were detected at
10 mg kg⁻¹ it was decided not to proceed to doses higher than 10 mg kg⁻¹ with
repeated infusions. Dose escalation in Arms 4 and 5, and then Arms 6 and 7,
was based on review of existing safety, tolerability, and pharmacokinetic data,
and recommendation of the external Data Monitoring Committee. Patients were
randomized (using a centralized interactive voice and web response system
(IXRS)) to a treatment group within Arms 1-3, 4 and 5, or 6 and 7, stratified by
ApoE ε4 status (carrier or non-carrier). Patient enrolment was monitored so that
the ratio of ApoE ε4 carriers to non-carriers was no more than 2:1 and no less than
1:2. During the overlap in enrolment of Arms 1-3 and Arms 4 and 5, patients were
randomized using a minimization algorithm. Patients who discontinued study
treatment for any reason were encouraged to remain in the study and complete all
assessments during the double-blind period. Patients completing the double-blind
period and meeting certain eligibility criteria entered the LTE. After enrolment
in Arms 6 and 7 was completed, the protocol was amended to include a titration
arm and a corresponding placebo group (Arms 8 and 9). Both the LTE and Arms
8 and 9 are ongoing and were not part of this interim analysis.

Investigators, study site staff (except for a designated pharmacist/technician), 
and study patients were blinded to the patients’ randomized treatment assignment 
for the placebo-controlled period. Only the designated pharmacist/technician at
each site was aware of the assigned treatment for each patient. Aducanumab was 
supplied as a sterile clear-to-yellow solution for IV infusion at a dose of 200 mg in 
4ml. For patients randomized to receive aducanumab, undiluted aducanumab 
(required volume based on patient weight) was added to a 100 ml 0.9% saline bag 
to reach the assigned dose (an equivalent amount of saline was first withdrawn 
so that the final total volume of all IV bags was identical). All IV bags (active and 
placebo (100 ml 0.9% saline)) were covered with a sealed brown light-protective 
bag to maintain blinding with a label including protocol and patient randomiza- 
tion number. 
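
The bag preparation described above reduces to a short weight-based calculation. The sketch below assumes that "an equivalent amount of saline" means a volume equal to the drug volume added, so every bag ends at 100 ml; the function name `infusion_plan` and its return keys are illustrative, not taken from the study protocol.

```python
VIAL_MG = 200.0   # drug content per vial, from the text
VIAL_ML = 4.0     # vial volume, from the text
BAG_ML = 100.0    # 0.9% saline bag volume, from the text


def infusion_plan(dose_mg_per_kg: float, weight_kg: float) -> dict:
    """Volumes needed to prepare one blinded IV bag for a weight-based dose."""
    concentration_mg_per_ml = VIAL_MG / VIAL_ML          # 50 mg/ml
    total_dose_mg = dose_mg_per_kg * weight_kg
    drug_volume_ml = total_dose_mg / concentration_mg_per_ml
    if drug_volume_ml > BAG_ML:
        raise ValueError("drug volume exceeds bag volume")
    # Assumption: saline withdrawn equals the drug volume added, so the
    # final volume of every bag (active or placebo) is identical.
    return {
        "total_dose_mg": total_dose_mg,
        "drug_volume_ml": drug_volume_ml,
        "saline_withdrawn_ml": drug_volume_ml,
        "final_volume_ml": BAG_ML - drug_volume_ml + drug_volume_ml,
    }


# Hypothetical example: a 70 kg patient assigned to 10 mg/kg.
example = infusion_plan(dose_mg_per_kg=10, weight_kg=70)
```

A placebo bag corresponds to a dose of 0 mg/kg, for which no saline is withdrawn and the bag stays at 100 ml.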

Cases of ARIA were managed in accordance with protocol-defined rules using 
centrally read MRI findings coupled with clinical symptoms, if present. The rules 
were consistent with the guidelines published by the Alzheimer Association 
Research Roundtable Working Group. Briefly, patients developing mild ARIA-E
or ARIA-H (<4 incident micro-haemorrhages) without clinical symptoms could 
continue at the same dose; patients developing moderate or severe ARIA-E without 
clinical symptoms, or those with ARIA-E accompanied by mild clinical symptoms, 
could suspend treatment and resume at the next lower dose level once ARIA (and 
symptoms, if any) resolved. Patients who developed ARIA-E or ARIA-H (<4 inci- 
dent micro-haemorrhages) accompanied by moderate, severe, or serious clinical 
symptoms, >4 incident micro-haemorrhages, any incident macro-haemorrhage, 
or >1 incident haemosiderosis at any time during the study were to permanently 
discontinue treatment. 
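
The management rules summarized above can be written as a small decision function. This is a sketch under stated assumptions: the thresholds are used exactly as printed (so a count of exactly 4 incident micro-haemorrhages is left unspecified), combinations the text does not cover return "unspecified", and the function name, argument names, and return labels are hypothetical.

```python
from typing import Optional

SEVERE_SYMPTOMS = {"moderate", "severe", "serious"}


def manage_aria(
    aria_e: Optional[str],          # None, "mild", "moderate" or "severe"
    microhaemorrhages: int = 0,     # incident micro-haemorrhages (ARIA-H)
    symptoms: str = "none",         # "none", "mild", "moderate", "severe", "serious"
    macrohaemorrhage: bool = False,
    haemosiderosis: int = 0,        # incident haemosiderosis findings
) -> str:
    """Return 'continue', 'suspend', 'discontinue' or 'unspecified'."""
    has_aria_e = aria_e is not None
    has_aria_h = microhaemorrhages > 0

    # Permanent discontinuation criteria.
    if (
        microhaemorrhages > 4
        or macrohaemorrhage
        or haemosiderosis > 1
        or ((has_aria_e or has_aria_h) and symptoms in SEVERE_SYMPTOMS)
    ):
        return "discontinue"

    # Suspend, then resume at the next lower dose once ARIA resolves.
    if has_aria_e and (
        symptoms == "mild"
        or (symptoms == "none" and aria_e in {"moderate", "severe"})
    ):
        return "suspend"

    # Continue at the same dose: mild ARIA-E or ARIA-H (<4), no symptoms.
    if (
        symptoms == "none"
        and microhaemorrhages < 4
        and haemosiderosis == 0
        and (aria_e == "mild" or (aria_e is None and has_aria_h))
    ):
        return "continue"

    # Anything the published summary does not address.
    return "unspecified"
```

In the trial these decisions were made from centrally read MRI findings and clinical judgement; the sketch only makes the ordering of the published rules explicit.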

The study was conducted in accordance with the Declaration of Helsinki, and
the International Conference on Harmonisation and Good Clinical Practice guidelines,
and had ethics committee approval at each participating site. All patients
provided written informed consent. 
Clinical study assessments. Amyloid plaque content, as measured by florbetapir 
PET imaging, was assessed at screening, and at weeks 26 and 54. Detailed PET 
scanning protocols have been described in a separate publication. Briefly, for each
florbetapir scan, a dose of 370 MBq was injected intravenously, with PET scanning 
starting around 50 min later and continuing for approximately 20 min. 

Visual reads, the basis for meeting the inclusion criterion of a positive AB PET 
scan, were based upon PET image data, with the registered MRI and fused PET/ 
MRI data providing supplementary anatomical information. Scans were inde- 
pendently interpreted by two board-certified neuroradiologists who, in accordance 
with the Amyvid Prescribing Information, had successfully completed a training
programme (provided by the manufacturer using either an in-person tutorial or
an electronic process). Images were designated as positive or negative, following
guidelines described in the Amyvid Prescribing Information.

A composite cortical SUVR was computed using a volume-weighted average 
across six brain regions of interest (frontal, parietal, lateral temporal and sensorimotor,
anterior, and posterior cingulate cortices), as previously described,
normalized to whole cerebellar activity.
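
The composite SUVR described above can be sketched as a volume-weighted mean of regional tracer uptake divided by whole-cerebellum uptake. The region keys, the function name `composite_suvr`, and all numeric inputs below are illustrative assumptions; the actual image-processing pipeline is described in the cited publication.

```python
# The six cortical regions of interest named in the text (keys are illustrative).
REGIONS = (
    "frontal",
    "parietal",
    "lateral_temporal",
    "sensorimotor",
    "anterior_cingulate",
    "posterior_cingulate",
)


def composite_suvr(uptake: dict, volume: dict, cerebellar_uptake: float) -> float:
    """Volume-weighted mean regional uptake, normalized to whole-cerebellum uptake."""
    if cerebellar_uptake <= 0:
        raise ValueError("cerebellar uptake must be positive")
    total_volume = sum(volume[r] for r in REGIONS)
    weighted_mean = sum(uptake[r] * volume[r] for r in REGIONS) / total_volume
    return weighted_mean / cerebellar_uptake


# Hypothetical inputs: mean uptake (arbitrary units) and region volume (ml).
example_uptake = {r: 1.0 for r in REGIONS}
example_uptake["frontal"] = 3.0
example_volume = {r: 10.0 for r in REGIONS}
example_volume["frontal"] = 30.0
example_suvr = composite_suvr(example_uptake, example_volume, cerebellar_uptake=1.0)
```

Weighting by volume means that larger regions contribute proportionally more to the composite than an unweighted mean of regional ratios would allow.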

Clinical tests including the CDR and an NTB (comprising RAVLT Immediate 
and Delayed Recall, Wechsler Memory Scale Verbal Pair Associate Learning Test 
Immediate and Delayed Recall, Delis-Kaplan Executive Function System Verbal 
Fluency Conditions 1 and 2, and the Wechsler Adult Intelligence Scale Fourth 
Edition Symbol Search and Coding Subsets) were performed during screening 
and at weeks 26 and 54. The FCSRT was performed at screening and at week 52. 
These clinical tests were administered by a trained, certified clinician or rater 
experienced in the assessment of patients with cognitive deficits. When possible, 
the same rater would administer a given test across all visits. In order to maintain 
blinding to adverse events, raters were not permitted to perform other clinical 
assessments, and were blinded to other clinical and safety assessments. The rater 
who conducted the CDR for a patient could not complete any other rating scales 
for that same patient, and was blinded to the results of all other cognitive scales. 

The following safety assessments were performed at regular intervals: physi- 
cal examination, neurological examination, vital signs, electrocardiogram, and 
laboratory safety assessments. During the placebo-controlled period, brain MRI 
was performed at screening and at weeks 6, 18, 30, 42, and 54, and end of study or 
termination. The MMSE was completed at screening, and at weeks 24, 52, and end 
of study or termination, and, in patients who developed ARIA, at every scheduled 
visit until ARIA resolved. 

The concentrations of aducanumab in serum and presence of anti-aducanumab 
antibodies were determined using validated ELISA techniques (Supplementary 
Information). 

Statistical analysis in the clinical study. This interim analysis included all patients 
randomized to a fixed-dose regimen and completing the double-blind period of the 
study (data cut-off February 2015). For all analyses, all patients assigned to placebo 
were treated as a single group. The safety population was defined as all patients 
who were randomized and received at least one dose of study treatment. Adverse 
events were coded using the Medical Dictionary for Regulatory Activities 
classification. The pharmacodynamic and pharmacokinetic populations were defined
as all patients who were randomized, received at least one dose of study treatment, 
and had at least one post-baseline assessment of the pharmacodynamic parameter or 
at least one measurable aducanumab concentration in serum, respectively. 

The primary analysis of the pharmacodynamic and efficacy data was based on 
Analysis of Covariance (ANCOVA), adjusting for baseline and ApoE ε4 status
(carrier and non-carrier) using observed data. No imputation was performed for 
missing data. For each time point, adjusted means for each treatment, pairwise 
adjusted differences with placebo, 95% confidence intervals for the pairwise dif- 
ferences, and associated nominal P values for comparison were calculated. No 
adjustments were made for multiple comparisons/multiple interim analyses. Dose- 
response was tested using a linear contrast from the ANCOVA model. The linear 
contrast test is sensitive to a variety of positive dose-response shapes, including a 
linear dose-response relationship. This served as the primary analysis for the cognition
analyses. To account for missing data, a mixed model for repeated measures (MMRM)
was used as a sensitivity analysis for the longitudinal change-from-baseline data,
adjusting for baseline and ApoE ε4 status (carrier and non-carrier). Visit and treatment
group were treated
as categorical variables in the model along with their interactions. An unstruc- 
tured covariance matrix was assumed to model the within-patient variability. 
This model imposes no assumptions on mean trend and correlation structure, 
and is considered robust. 
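
The linear contrast used for the dose-response test can be illustrated with a minimal sketch. Everything numeric here is hypothetical: the group means, the standard errors, and the equally spaced, centred coefficients (the text does not list the coefficients used). The sketch also treats the group estimates as independent and uses a normal approximation, whereas the study's test was computed from the fitted ANCOVA model.

```python
from math import sqrt
from statistics import NormalDist


def linear_contrast(means, ses, coeffs):
    """Estimate a contrast sum(c_i * mean_i) with a normal-approximation test.

    Simplification: group estimates are treated as independent. A contrast
    taken from an ANCOVA fit would use the model covariance matrix and a
    t distribution instead.
    """
    if abs(sum(coeffs)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    estimate = sum(c * m for c, m in zip(coeffs, means))
    se = sqrt(sum((c * s) ** 2 for c, s in zip(coeffs, ses)))
    z = estimate / se
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return estimate, se, p_two_sided


# Hypothetical adjusted mean changes for placebo and four ascending doses.
hypothetical_means = [0.00, -0.05, -0.10, -0.20, -0.25]
hypothetical_ses = [0.02, 0.02, 0.02, 0.02, 0.02]
# Assumed equally spaced, centred scores (not taken from the paper).
assumed_coeffs = [-2, -1, 0, 1, 2]

estimate, se, p = linear_contrast(hypothetical_means, hypothetical_ses, assumed_coeffs)
print(f"contrast = {estimate:.3f}, s.e. = {se:.3f}, two-sided P = {p:.2g}")
```

A contrast of this form gives most weight to the extreme groups, which is why it responds to any broadly monotonic dose-response shape and not only to a strictly linear one.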

Subgroup analyses were performed for change from baseline in Aβ PET and
change from baseline in cognition measures (CDR-SB and MMSE) by baseline
clinical stage and ApoE ε4 status (carrier and non-carrier). The subgroup analysis
of the pharmacodynamic and efficacy data was based on ANCOVA, adjusting for
baseline and ApoE ε4 status (carrier and non-carrier) (for baseline clinical stage
only) using observed data.

Serum pharmacokinetics were determined using a nonlinear mixed-effects modelling
(NONMEM) approach. Sparse samples in the multiple-ascending-dose study and
intensive samples from an earlier single-ascending-dose study® were combined to 
construct a population pharmacokinetic model. The model was built in NONMEM 
software using the first-order conditional estimation with interaction method. 
Cumulative AUC up to month 12 was estimated for each patient. The plasma 
terminal elimination half-life was estimated in the pharmacokinetic analysis pop- 
ulation. The analysis population for the primary immunogenicity analysis was 
defined as all patients who were randomized, received study treatment, and had at 
least one post-dose immunogenicity sample evaluated for immunogenicity. 

Interim analyses were specified in the protocol for the purpose of planning 
future studies; no changes were to be made for this study based on the interim 
analysis results. 

A sample size of 30 patients per treatment group would provide more than 90%
power to detect a treatment difference of 1 standard deviation with respect to the
reduction of Aβ from baseline, based on comparison of each aducanumab group
with placebo, at a two-sided significance level of 0.05, and assuming a dropout
rate of 20%.
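
The sample-size statement can be checked approximately with a short calculation. This is a sketch under stated assumptions, not the authors' calculation: it assumes the 20% dropout leaves 24 evaluable patients per group, compares two groups with a two-sided test, and uses the normal approximation (an exact t-test calculation gives a slightly lower power).

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(n_per_group, effect_size_sd, alpha=0.05):
    """Approximate power of a two-sided, two-sample comparison of means.

    effect_size_sd is the true difference in units of the common standard
    deviation. Normal approximation only.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative.
    ncp = effect_size_sd * sqrt(n_per_group / 2)
    return (1 - std_normal.cdf(z_crit - ncp)) + std_normal.cdf(-z_crit - ncp)


planned_n = 30
dropout_rate = 0.20
# Assumption: dropout simply reduces the evaluable group size.
completers = round(planned_n * (1 - dropout_rate))

power = two_sample_power(completers, effect_size_sd=1.0)
print(f"{completers} completers per group -> approximate power {power:.3f}")
```

Under these assumptions the approximate power exceeds 90%, in line with the statement above.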
Transgenic mouse studies. Penetration of aducanumab into the brain and target
engagement were assessed in 22-month-old female Tg2576 mice following a single
dose of aducanumab at 30 mg kg⁻¹ administered i.p. (‘Target engagement study’;
n = 4–5 per time point). The ability of aducanumab to reduce Aβ burden was
assessed following chronic treatment of 9-month-old male and female Tg2576
transgenic mice dosed weekly i.p. for 6 months with PBS or 0.3, 1, 3, 10, or 30 mg kg⁻¹
of the murine chimaeric variant chaducanumab (‘Chronic efficacy study’; n = 20–55
per treatment group). An additional dosing study (‘Chronic efficacy study with
Agly’; n = 12–14 per treatment group) comparing the plaque-clearing ability of
chaducanumab to that of an effector function-impaired variant (chaducanumab-Agly)
was conducted using a similar study design (chronic treatment of 9.5-month-old
Tg2576 transgenic mice dosed weekly i.p. for 6 months with PBS or 3 mg kg⁻¹ of
chaducanumab or chaducanumab-Agly).

Mice were killed following anaesthesia with ketamine/xylazine (100/10 mg kg⁻¹
i.p.). Blood was collected by cardiac puncture, and mice were perfused with ice-cold
heparinized saline (0.9%) using a peristaltic pump. The brain was removed and
halved along the medio-sagittal line. The right hemisphere was frozen on dry ice
and stored at −80 °C for biochemical analysis. The left hemisphere was fixed by
immersion in 10% neutral buffered formalin. 

Size of the treatment groups was determined to take into account natural mor- 
tality (10-20%) and high inter-animal variability specific to the Tg2576 strain of 
mice. No animals were excluded from the analyses, unless the animal died pre- 
maturely. ‘n’ reported in the manuscript represents the number of animals in each 
group that were euthanized as scheduled at the end of the study. The allocation of 
animals to treatment groups took into account date of birth, gender, and weight 
at baseline. Each treatment group was balanced for mean age, gender, and mean 
weight. Dosing solutions were coded with letters so that all experimenters were
blinded to the treatment. The labelling of the samples collected did not reflect
treatment group, so that experimenters processing and analysing the samples 
were still blinded. Codes were broken once all analyses were completed, including 
statistical analysis. 

All in-life procedures were conducted in strict accordance with protocols
approved by Biogen’s Institutional Animal Care and Use Committee.
Biochemical measurements. Please see Supplementary Information. 
Histological assessment. Please see Supplementary Information. 
Preparation of different Aβ peptide conformations. Synthetic Aβ1–42 (Aβ42)
peptide (AnaSpec, Fremont, California, USA) was reconstituted in
hexafluoroisopropanol at a concentration of 1 mg/ml, aliquoted, air-dried, and
vacuum-concentrated to form a film, and dissolved in dimethyl sulfoxide (DMSO) at a
concentration of 5 mg/ml. Aβ42 oligomers and Aβ42 fibrils were prepared by diluting
DMSO-reconstituted monomeric peptide into PBS at a concentration of 100 µg/ml and
incubating at 37 °C for at least 3 days and 1 week, respectively. The solution was
centrifuged at 14,000g for 15 min at 4 °C, and oligomers were recovered from the
supernatant following the shorter incubation, whereas fibrils were recovered from
the pellet following the longer incubation. For details on the biophysical
characterization of high molecular weight Aβ42 aggregates, please see Supplementary
Information.

In immunoprecipitation experiments, samples of freshly prepared monomeric,
soluble oligomeric, or insoluble fibrillar Aβ42 were immunoprecipitated with
chaducanumab, 3D6 or a murine IgG2a control antibody (P1.17), dot-blotted onto
a nitrocellulose membrane, and detected with biotinylated pan-Aβ antibody 6E10.
Similar results were observed for chaducanumab when immunoblotted with 3D6.
ELISA. Please see Supplementary Information. 

Antibody generation using reverse translational medicine. Aducanumab was 
derived from a de-identified blood lymphocyte library collected from healthy 
elderly subjects with no signs of cognitive impairment and cognitively impaired 
elderly subjects with unusually slow cognitive decline. Memory B cells, isolated
from peripheral blood lymphocyte preparations by anti-CD22-mediated sorting,
were cultured on gamma-irradiated human peripheral blood mononuclear cell
feeder layers. Supernatants from isolated B cells were screened for their ability to
stain Aβ plaques on brain tissue sections, from either patients with AD or aged
APP transgenic mice®, and for their binding to aggregated forms of Aβ40 and Aβ42
in vitro. Positive hits meeting the above criteria were counter-screened to exclude
clones cross-reacting with full-length APP expressed on stably transfected HEK293
cells (provided by U. Konietzko, University of Zurich, Switzerland; tested negative
for mycoplasma contamination; not independently authenticated). Selected
Aβ-reactive B-cell clones were subjected to cDNA cloning of IgG heavy and κ or λ
light chain variable region sequences, and sub-cloned in expression constructs
using Ig-framework specific primers for human variable heavy and light chain
families in combination with human J-H segment-specific primers. Aducanumab was
engineered to incorporate glycosylated human IgG1 heavy and human κ light chain
constant domain sequences. A murine chimaeric IgG2a/κ version of aducanumab
(chaducanumab) was generated for use in chronic efficacy studies in APP transgenic
mice. An aglycosylated variant of aducanumab (chaducanumab-Agly), incorporating
a single point mutation (N297Q, using standard Kabat EU numbering)
which eliminates N-glycosylation of the Fc region and severely reduces FcγR
binding”, was generated to test for Fc-related activities. The recombinant mouse 
IgG2b monoclonal antibody 3D6*' was used as a comparator in some studies. 

Ex vivo phagocytosis assay. Please see Supplementary Information. 
Extended Data Figure 1 | Participant accounting. PET, positron emission tomography.

Extended Data Figure 2 | Amyloid plaque reduction with aducanumab
by baseline clinical stage and baseline ApoE ε4 status. a, b, Analyses
by baseline clinical stage were performed using ANCOVA for change
from baseline with factors of: treatment, ApoE ε4 status (carrier and
non-carrier) and baseline composite SUVR (a), and for analyses by ApoE
ε4 status, using treatment and baseline composite SUVR (b). Adjusted
mean ± s.e. ApoE ε4, apolipoprotein E ε4 allele; SUVR, standard uptake
value ratio.

Extended Data Figure 3 | Amyloid plaque reduction: regional analysis SUVR at week 54. The boxed area indicates the six regions included in the
composite score. *P < 0.05; **P < 0.01; ***P < 0.001 versus placebo; two-sided tests with no adjustments for multiple comparisons. Adjusted mean ± s.e.
Analyses using ANCOVA. SUVR, standard uptake value ratio.

Extended Data Figure 4 | Brain penetration of aducanumab after
a single intraperitoneal administration in 22-month-old Tg2576
transgenic mice. a, b, Aducanumab levels in plasma and brain (a), and
plasma Aβ levels after a single dose (b; n = 4–5; mean ± s.e.). c, d, In vivo
binding of aducanumab to amyloid deposits detected using a human
IgG-specific secondary antibody (c), and ex vivo immunostaining with a
pan-Aβ antibody on a consecutive section (d). Examples of a compact Aβ
plaque (solid arrow), diffuse Aβ deposit (dashed arrow), and CAA lesion
(dotted arrow). CAA, cerebral amyloid angiopathy.

Extended Data Figure 5 | Exposure following weekly dosing with
chaducanumab in 9.5- to 15.5-month-old Tg2576 transgenic mice.
a, b, chaducanumab concentrations in plasma (a), or DEA-soluble brain
extract (b) were measured in samples collected 24 h after the last dose in
the ‘Chronic efficacy study’. Mean ± s.e. Dotted lines represent the limits
of quantitation of each assay. c, Correlations of drug concentrations in
plasma (open circles) or brain (open triangles) with administered dose.
The average brain concentrations in the two groups receiving the lowest
doses were below the limit of quantitation for that assay, which is indicated
by a dotted line on the figure.

Extended Data Figure 6 | Treatment with chaducanumab affects plaques
of all sizes. a, Following weekly dosing of chaducanumab in Tg2576 mice
from 9.5–15.5 months of age, amyloid plaques were stained with 6E10
and quantified using Visiopharm software. b, Plaque size was defined by
area, and coloured as follows: <125 µm² (cyan), 125–250 µm² (green),
250–500 µm² (pink), and >500 µm² (red). c, chaducanumab treatment was
associated with a significant decrease in plaque number in all size ranges
relative to vehicle-treated controls, with reductions of 58%, 68%, 68%, and
53% in the number of plaques for the <125 µm², 125–250 µm², 250–500 µm²,
and >500 µm² size groups, respectively. Mean ± s.e.; statistically significant
differences from vehicle for each size range are indicated with asterisks;
*P < 0.05, Mann-Whitney test.
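
The plaque size ranges in Extended Data Figure 6 amount to a simple binning rule, sketched below. The plaque areas are hypothetical, and the handling of a plaque lying exactly on a boundary is an assumption, because the caption does not specify it. The study itself quantified plaques with Visiopharm software; this sketch only illustrates the binning and the percentage-reduction arithmetic.

```python
from bisect import bisect_right
from collections import Counter

# Upper edges (µm²) of the first three size ranges named in the caption.
SIZE_EDGES_UM2 = [125, 250, 500]
SIZE_LABELS = ["<125", "125-250", "250-500", ">500"]


def size_range(area_um2):
    """Return the size-range label for one plaque area in µm².

    Assumption: a plaque exactly on an edge (for example 125 µm²) is
    assigned to the larger range.
    """
    return SIZE_LABELS[bisect_right(SIZE_EDGES_UM2, area_um2)]


def percent_reduction(vehicle_count, treated_count):
    """Percentage decrease in plaque number relative to vehicle."""
    return 100.0 * (vehicle_count - treated_count) / vehicle_count


# Hypothetical plaque areas (µm²), not data from the study.
hypothetical_areas = [40.0, 130.5, 260.0, 499.9, 750.0, 90.2]
counts = Counter(size_range(a) for a in hypothetical_areas)
print(dict(counts))
```

Counting plaques per range in treated and vehicle groups and passing the two counts to `percent_reduction` gives figures of the kind reported in panel c.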

Extended Data Figure 7 | Enhanced recruitment of microglia
to amyloid plaques following chaducanumab treatment and
engagement of Fc receptors. a, b, Brain sections from either PBS- or
chaducanumab-treated mice (‘Chronic efficacy study’; 3 mg kg⁻¹ group)
were immunostained for Aβ (6E10; red) and a marker of microglia
(Iba1; brown). c, The area of individual amyloid plaques was measured,
and Iba1-stained microglia were grouped into two categories, either
associated with plaques (within 25 µm of a plaque) or not associated with
plaques (>25 µm from a plaque). Plaques with >70% of their circumference
surrounded by microglia were quantified and stratified based on
plaque size. The fraction of plaques that were at least 70% surrounded
by microglia was significantly greater in the aducanumab-treated
group (white bars) compared with the PBS control group (grey bars),
for plaques >250 µm². Mean ± s.e.; statistically significant differences
from vehicle for each size range are indicated with asterisks; *P < 0.05,
Bonferroni’s post hoc test following one-way analysis of variance. All
quantifications were done using the Visiopharm software. d, e, FITC-labelled
Aβ42 fibrils were incubated with different concentrations of
the antibodies before adding to BV-2 microglia cell line (d), or primary
microglia (e) for phagocytosis experiment measuring uptake of Aβ42 fibrils
into the cells by FACS analysis. Mean ± s.d.

Extended Data Table 1 | Change from baseline in amyloid PET SUVR values (a secondary endpoint at 6 months), and in exploratory clinical
endpoints at the end of the placebo-controlled period (6-month data also shown for amyloid PET)

Adjusted mean ± SE change from baseline for:   Placebo   Aducanumab 1 mg kg⁻¹   3 mg kg⁻¹   6 mg kg⁻¹   10 mg kg⁻¹   P value (dose-response)
Amyloid PET SUVR values
  At 6 months   (n=34)   (n=26)   (n=27)   (n=23)   (n=27)
                −0.005 ± 0.018   −0.030 ± 0.020   −0.087 ± 0.020**   −0.143 ± 0.022***   −0.205 ± 0.020***   <0.0001
  At 1 year†    (n=30)   (n=21)   (n=26)   (n=23)   (n=21)
                0.003 ± 0.021   −0.055 ± 0.024   −0.135 ± 0.022***   −0.210 ± 0.024***   −0.268 ± 0.025***   <0.0001
CDR-SB†         (n=31)   (n=23)   (n=27)   (n=26)   (n=23)
                1.87 ± 0.41   1.72 ± 0.46   1.37 ± 0.43   1.11 ± 0.44   0.63 ± 0.47*   <0.05
MMSE‡           (n=32)   (n=25)   (n=26)   (n=26)   (n=25)
                −2.81 ± 0.67   −2.18 ± 0.75   −0.70 ± 0.75*   −1.96 ± 0.75   −0.56 ± 0.76*   <0.05
NTB overall Z score†   (n=29)   (n=23)   (n=26)   (n=24)   (n=24)
                −0.11 ± 0.08   −0.25 ± 0.09   −0.13 ± 0.08   −0.19 ± 0.09   −0.10 ± 0.09   NS
FCSRT: sum of free recall score†   (n=31)   (n=23)   (n=25)   (n=25)   (n=25)
                2.33 ± 1.07   1.63 ± 1.24   −1.25 ± 1.20   4.04 ± 1.21   −0.69 ± 1.20   NS

*P<0.05; **P<0.01; ***P<0.001 versus placebo; two-sided tests with no adjustments for multiple comparisons.
†At week 54.
‡At week 52.
Analyses using ANCOVA. ApoE ε4, apolipoprotein E ε4 allele; CDR-SB, Clinical Dementia Rating-Sum of Boxes; FCSRT, Free and Cued Selective Reminding Test; MMSE, Mini-Mental State Examination;
NS, not significant; NTB, neuropsychological test battery; SE, standard error; SUVR, standard uptake value ratio.

                                                   Placebo   Aducanumab 1 mg kg⁻¹   3 mg kg⁻¹   6 mg kg⁻¹   10 mg kg⁻¹
Number of dosed subjects with at least one post-baseline MRI   38   31   32   30   32
  ApoE ε4 carrier   24   19   21   21   20
  ApoE ε4 non-carrier   14   12   11   9   12
ARIA-E, n (%)   0   1 (3)   2 (6)   11 (37)   13 (41)
  By ApoE ε4
    ApoE ε4 carrier   0   1 (5)   1 (5)   9 (43)   11 (55)
    ApoE ε4 non-carrier   0   0   1 (9)   2 (22)   2 (17)
  ARIA-E and:
    Continued treatment   0   0   2 (6)   8 (27)   5 (16)
      Same dose   0   0   0   2 (7)   0
      Dose reduced   0   0   2 (6)   6 (20)   5 (16)
    Discontinued treatment   0   1 (3)   0   3 (10)   8 (25)
      ApoE ε4 carrier   0   1 (5)   0   2 (10)   7 (35)
      ApoE ε4 non-carrier   0   0   0   1 (11)   1 (8)
Isolated ARIA-H, n (%)   2 (5)   2 (6)   3 (9)   0   2 (6)
  ApoE ε4 carrier   2 (8)   1 (5)   2 (10)   0   2 (10)
  ApoE ε4 non-carrier   0   1 (8)   1 (9)   0   0
ARIA-E and ARIA-H, n (%)   0   1 (3)   1 (3)   5 (17)   8 (25)
  ApoE ε4 carrier   0   1 (5)   1 (5)   5 (24)   7 (35)
  ApoE ε4 non-carrier   0   0   0   0   1 (8)

ApoE ε4, apolipoprotein E ε4 allele; ARIA, amyloid-related imaging abnormalities; ARIA-E (oedema); ARIA-H (micro-haemorrhages, macro-haemorrhages, or superficial siderosis);
MRI, magnetic resonance imaging.

Extended Data Table 3 | Pharmacokinetic data

Aducanumab dose   1 mg kg⁻¹   3 mg kg⁻¹   6 mg kg⁻¹   10 mg kg⁻¹
PK analysis population (intent-to-treat)*   n=31   n=32   n=30   n=32
  Cumulative AUC (µg·h/mL, mean ± SD)   47,078 ± 17,555   143,395 ± 59,986   251,535 ± 122,883   346,163 ± 198,603
Subjects who received all 14 planned doses   n=18   n=18/19†   n=16   n=14
  Cmax,ss (µg/mL, mean ± SD)‡   21.2 ± 3.7   59.6 ± 19.6   123.8 ± 42.5   250.8 ± 33.5
  Cumulative AUC (µg·h/mL, mean ± SD)   55,223 ± 11,529   169,457 ± 41,775   315,352 ± 76,300   524,511 ± 95,622

*Data include patients who missed doses.
†A total of 19 patients received all 14 doses but 1 patient missed the concentration measurement at Week 40 and so n=18 for Cmax,ss at 3 mg kg⁻¹ aducanumab.
‡The observed post-infusion concentrations at Week 40 were reported as steady-state Cmax.
AUC, area under the concentration curve; Cmax,ss, maximum concentration at steady state; PK, pharmacokinetic; SD, standard deviation.

Extended Data Table 4 | Change from baseline in amyloid PET SUVR values, CDR-SB, and MMSE at the end of the placebo-controlled period
by absence/presence* of ARIA-E

Treatment group (# without ARIA-E, # with ARIA-E)

Adjusted mean ± SE for:   Placebo   Aducanumab 1 mg kg⁻¹   3 mg kg⁻¹   6 mg kg⁻¹   10 mg kg⁻¹
Amyloid PET SUVR values†   (30, 0)   (21, 0)   (24, 2)   (17, 6)   (13, 8)
  ARIA-E absence    0.003 ± 0.020   −0.056 ± 0.024   −0.141 ± 0.023   −0.243 ± 0.027   −0.278 ± 0.031
  ARIA-E presence   0.001 ± 0.020   -   −0.069 ± 0.075   −0.114 ± 0.049   −0.263 ± 0.040
CDR-SB†   (31, 0)   (23, 0)   (25, 2)   (18, 8)   (14, 9)
  ARIA-E absence    1.84 ± 0.42   1.72 ± 0.48   1.33 ± 0.47   1.11 ± 0.54   0.78 ± 0.61
  ARIA-E presence   1.95 ± 0.35   -   2.04 ± 1.38   1.18 ± 0.73   0.67 ± 0.67
MMSE‡   (32, 0)   (25, 0)   (24, 2)   (18, 8)   (16, 9)
  ARIA-E absence    −2.86 ± 0.69   −2.20 ± 0.77   0.47 ± 0.80   −1.82 ± 0.91   −1.05 ± 0.96
  ARIA-E presence   −2.60 ± 0.69   -   −3.41 ± 2.69   −1.95 ± 1.42   0.83 ± 1.35

*Since there were no ARIA-E events in the placebo group, the overall placebo group was used as the comparator in the subgroup analysis for presence of ARIA-E.
†At week 54.
‡At week 52.
Analyses based on observed data. Adjusted mean change and standard errors are based on an ANCOVA model for change from baseline with factors of treatment, laboratory ApoE ε4 status (carrier
and non-carrier), and baseline composite SUVR, CDR-SB, or MMSE, respectively. ARIA-E, amyloid-related imaging abnormalities (oedema); CDR-SB, Clinical Dementia Rating-Sum of Boxes; MMSE,
Mini-Mental State Examination; PET, positron emission tomography; SE, standard error; SUVR, standard uptake value ratio.

The epiblast (EPI) is the origin of all somatic and germ cells in mammals, and of pluripotent stem cells in vitro. To explore
the ontogeny of human and primate pluripotency, here we perform comprehensive single-cell RNA sequencing for pre- 
and post-implantation EPI development in cynomolgus monkeys (Macaca fascicularis). We show that after specification 
in the blastocysts, EPI from cynomolgus monkeys (cyEPI) undergoes major transcriptome changes on implantation. 
Thereafter, while generating gastrulating cells, cyEPI stably maintains its transcriptome over a week, retains a unique 
set of pluripotency genes and acquires properties for ‘neuron differentiation’. Human and monkey pluripotent stem cells 
show the highest similarity to post-implantation late cyEPI, which, despite co-existing with gastrulating cells, bears 
characteristics of pre-gastrulating mouse EPI and epiblast-like cells in vitro. These findings not only reveal the divergence 
and coherence of EPI development, but also identify a developmental coordinate of the spectrum of pluripotency among 
key species, providing a basis for better regulation of human pluripotency in vitro. 


The early embryonic development in mammals is decisive for the 
development of the pluripotent lineage, the EPI, which generates all 
of the somatic and the germ cell lineages!”. EPI is also a source of 
pluripotent stem cells (PSCs) in vitro, including embryonic stem cells 
(ESCs) derived from the pre-implantation EPI? and epiblast stem 
cells derived from the post-implantation EPI>°®, which, together with 
induced pluripotent stem cells (iPSCs)”*, are critical resources for a 
broad range of biomedical applications’. 

Mammalian development has been studied almost exclusively using 
mice as a model organism. However, the mechanisms for mammalian 
development, including EPI development, are divergent among 
species!°. Consequently, PSCs derived from EPI also have divergent 
properties; whereas mouse (m) ESCs have a naive pluripotency with 
unbiased potential for differentiation and chimaera contribution, 
human (h) or primate ESCs show similarity to mouse epiblast stem 
cells and exhibit a primed pluripotency with biased differentiation 
capacity and limited potential for chimaera contribution!!. For 
better understanding of hPSCs, investigation into the mechanism for 
human/primate embryonic development is critical. Such investigation, 
however, has been hampered owing to difficulties in analysing human 
and primate early post-implantation embryos. To gain insight into the 
mechanism of embryonic development in humans and primates and 
the properties of hPSCs, we performed a comprehensive analysis of 
the single-cell transcriptome during pre- and early post-implantation 
development in cynomolgus monkeys (Macaca fascicularis), a primate 
closely related to humans and suitable for biological experimentation 
(Extended Data Fig. 1a). 


Pre- and post-implantation development 

We isolated the metaphase-II-stage oocytes, fertilized them and 
cultured the resultant embryos’? (Fig. 1a, Extended Data Fig. 1b-e). 
The blastocysts were observed from embryonic day (E) 5 (Fig. 1a,
Extended Data Fig. 1d). We examined the expression of key markers" 
by immunofluorescence analysis. Nearly all of the cells in blastocysts
at E6 were positive for OCT4, whereas NANOG was more restricted
(Fig. 1b). From E7, OCT4 appeared to decline in outer cells, whereas
NANOG was confined to a subset of OCT4⁺ inner cell mass (ICM)
(Fig. 1b). At E9, both OCT4 and NANOG were confined to ICM cells 
(Fig. 1b, Extended Data Fig. 1f). Unlike in mice, but like in humans), 
CDX2 was undetectable in blastocysts at E6, and became evident in 
outer cells at E7 (Extended Data Fig. 1f, g). Whereas GATA4 became 
detectable around ICM from E7 onwards (Fig. 1b, Extended Data 
Fig. 1f, g), GATA6 was widely expressed in blastocysts at E6 and in all 
cells except OCT4⁺ ICM from E7 onwards (Extended Data Fig. 1f, h).
GATA4 and CDX2 showed mutually exclusive expression (Extended
Data Fig. 1g), whereas GATA6 was expressed in virtually all of the
CDX2⁺ cells (Fig. 1c). TFAP2C exhibited strong, weak, and no
expression in GATA6⁺ outer cells, OCT4⁺GATA6⁻ ICM, and GATA6⁺
ICM, respectively (Extended Data Fig. 1i). Thus, the ICM versus
trophectoderm specification may manifest relatively late and the EPI
versus hypoblast specification takes place at around E7 via a unique
mechanism (Extended Data Fig. 1j).

We isolated early post-implantation embryos (E13, E14, E16, and 
E17) with ethical approval (Methods, Extended Data Fig. 2a, b). 
Consistent with previous literature on rhesus embryos'*"!” (Extended 
Data Fig. 2c), implanted embryos exhibited disc-shaped, columnar 
EPI continuous with squamous amnionic cells (Fig. 2a, b, Extended 
Data Fig. 2d). Beneath EPI were a basement membrane and 
visceral endoderm, which was continuous with yolk-sac endoderm. 
EPI, amnion, and visceral/yolk-sac endoderm (VE/YE) were embedded 
in extra-embryonic mesenchyme (Fig. 2b, Extended Data Fig. 2d). 
Gastrulating cells above visceral endoderm were apparent in an E16 
embryo (Fig. 2b). EPI was positive for OCT4 and NANOG, and VE/ 
YE was positive for GATA4 and GATA6, and the gastrulating cells were 
weakly positive for OCT4 and positive for T, a marker for primitive 
streak/incipient mesoderm (Fig. 2c, d, Extended Data Fig. 2e, f). The 
continued expression of NANOG in EPI until late after implantation 
was a marked difference from the immediate repression of NANOG 
after implantation in mice!*®. Extra-embryonic mesenchyme cells were
positive for GATA4 and GATA6 (Extended Data Fig. 2f), consistent
with their derivation from hypoblast/VE/YE”.

Figure 1 | Monkey pre-implantation development. a, Images of monkey
pre-implantation embryos. Oocytes were fertilized around noon and the
embryos were observed every day until E8 (embryos, n = 68). AM, 9:00;
PM, 21:00. Embryos with more than 16 cells without blastocoel cavities,
those with blastocoel cavities, and those with cells outside the zona
pellucida were classified as morula, blastocysts and hatched, respectively.


Lineage delineation by transcriptome 

We prepared single-cell cDNAs from pre- and post-implantation 
embryos using the single-cell mRNA 3-prime end sequencing 
(SC3-seq) method”? and screened all of the cDNAs for key markers by 
quantitative PCR (qPCR) (Extended Data Fig. 3). For pre-implantation 
cells (Extended Data Fig. 3b), at E6, POU5F1 (gene encoding OCT4) 
was expressed ubiquitously, and NANOG exhibited more heterogeneous
expression. With the six markers (POU5F1, NANOG, CDX2, TFAP2C, 
GATA4 and GATA6), we could not discriminate trophectoderm until E7. 
We detected low levels of GATA4 in some cells as early as E6. At E7, 
GATA4" cells were negative for NANOG expression, suggesting 
hypoblast specification. At E8 and E9, the cells were classified into three 
types: NANOG"™; GATA4- (EPI), NANOG'*-; GATA4" (hypoblast), 
and NANOG"-; GATA4-; GATA6/CDX2/TFAP2C" (trophectoderm), 
indicating a clear segregation of the three lineages. 

For post-implantation embryos (Extended Data Fig. 3c), we gener- 
ated cDNAs from cells expressing key pluripotency markers (POU5F1, 
NANOG, SOX2, PRDM14), representing EPI and its related lineages,
as well as cDNAs from cells negative for such markers, but positive
for GATA4, representing VE/YE and its related lineages. Notably,
not only NANOG, but also PRDM14 (refs 20, 21), continued to show
high expression until E17. Some of the cells that were positive for
pluripotency markers were positive for T as early as E13, and cells with
expression of T continued to express POU5F1, but tended to repress
NANOG, SOX2, and PRDM14, particularly after E16, indicating
their mesodermal or endodermal identity. We isolated cells that
were negative for the pluripotency markers and GATA4, but positive
for GATA2, presumably representing post-implantation parietal
trophectoderm. We also isolated candidates for primordial germ cells
(SOX17⁺PRDM1⁺TFAP2C⁺ and SOX2⁻), which will be analysed
separately (data not shown). 

Figure 1 (continued) | b, Expression of NANOG/OCT4/GATA4 from E6 to E9 (embryos n = 10,
6, 5, 4, respectively). The numbers of cells positive for each marker
are indicated. Insets are higher magnifications of ICM. c, Expression
of GATA6/CDX2 at E8 (embryos n = 5). Yellow arrowheads, GATA6⁺
(hypoblast); white arrowheads, GATA6⁺ CDX2⁺ (trophectoderm). Scale
bars, 100 µm.

We analysed the transcriptome of 390 cells (pre, 193; post, 197)
(Extended Data Fig. 3, Supplementary Table 1) by SC3-seq, the
performance of which was better than or comparable to those of
other single-cell RNA-seq methods!®?-*4 (Extended Data Fig. 4). 
Unsupervised hierarchical clustering (UHC) classified all the cells into 
two large clusters, one consisting mainly of pre-implantation and the 
other only of post-implantation cells (Fig. 3). The principal component 
analysis (PCA) and a correlation analysis provided consistent outcomes 
(Extended Data Fig. 5a, b). On the basis of the markers and the cells’ 
developmental stages, we annotated pre-implantation cells at E6 as 
either undifferentiated ICM or pre-implantation early trophectoderm 
(preE-TE), cells at E7 as either pre-implantation epiblasts (pre-EPI), 
preE-TE, pre-implantation late TE (preL-TE) or hypoblast, and cells 
at E8 and E9 as either pre-EPI, preL-TE, or hypoblast (Fig. 3, Extended 
Data Fig. 5a, b). PCA revealed a close relationship between ICM and 
preE-TE, a progressive transition from ICM to pre-EPI and from 
preE- to preL-TE, and a distinct property of hypoblast (Extended 
Data Fig. 5c). The differentially expressed genes among the annotated 
lineages are listed in Supplementary Table 2. Similarly, we annotated 
the post-implantation cells as the post-implantation early or late EPI 
(postE-EPI or postL-EPI), gastrulating cells 1, 2a and 2b (Gast1, 2a 
and 2b), VE/YE, and extra-embryonic mesenchyme (EXMC), and a 
distinct group of cells clustered with preL-TE as post-implantation 
parietal trophectoderm (Fig. 3, Extended Data Fig. 5a, b). 

POU5F1 and NANOG were highly expressed in post-EPI (postE-EPI
and postL-EPI) and Gast1, whereas SOX2 and PRDM14 were downregulated
in a number of Gast1, and postL-EPI and Gast1, respectively
(Extended Data Fig. 5d). T showed high expression in Gast1 and Gast2a
as well as in some of post-EPI, but was low/negative in Gast2b, VE/YE, 
and extra-embryonic mesenchyme (Extended Data Fig. 5d). GATA4 and 
GATA6 were expressed in Gast1, Gast2a and Gast2b as well as in VE/YE 
and extra-embryonic mesenchyme, except that GATA6 was negative 
in a majority of Gast (Extended Data Fig. 5d). We identified FOXA1 
as a gene specifically expressed in VE/YE but not in extra-embryonic 
mesenchyme, and reciprocally, COL6A1 showed strong expression in 
extra-embryonic mesenchyme, but weak in VE/YE (Extended Data 
Fig. 5d, e). T, MIXL1, and CDX2 were low/negative, but markers for 
epithelial-mesenchymal transition were high in extra-embryonic 
mesenchyme (Extended Data Fig. 5d). The genes highly expressed 
in extra-embryonic mesenchyme were enriched with those bearing
Gene Ontology terms such as ‘extracellular matrix’ (Extended Data
Fig. 5f-h, Supplementary Table 2), and some of them exhibited specific/
high expression in hypoblast (Extended Data Fig. 5g). Conversely, some
of the genes that were highly expressed in hypoblast remained high in
extra-embryonic mesenchyme (Extended Data Fig. 5i, j, Supplementary
Table 2). Extra-embryonic mesenchyme exhibited transcriptional
heterogeneity similar to postE-EPI (Extended Data Fig. 5k). Thus,
trophectoderm and hypoblast are specified by E6 and E7, respectively.
EPI originates from ICM at E6 and through implantation, progressively
acquires distinct properties, while generating gastrulating cells from
around E13. Extra-embryonic mesenchyme cells appear to derive from
hypoblast through epithelial-mesenchymal transition and generate
abundant extracellular matrix (Extended Data Fig. 5l).

Figure 2 | Monkey early post-implantation development. a, A post-implantation
embryo at E14 observed under a dissection microscope.
b, Haematoxylin and eosin staining of the sections of post-implantation
embryos at E13 and E16. c, d, Expression of OCT4/NANOG (c), and
OCT4/T (arrowheads) (d) in embryos at E14 and E16 (embryos n = 2, 2,
respectively). When recognizable, anterior is to the left. AM, amnion; CT,
cytotrophoblast; CS*, future connecting stalk; EXMC, extra-embryonic
mesenchyme; EXCM, exocoelomic membrane; Gast, gastrulating cells;
SYS, secondary yolk sac; VE, visceral endoderm; YE, yolk-sac endoderm.
Scale bars, 100 µm.


Transition of pluripotency properties
We focused on the transition of the properties of EPI. The expression
of key genes associated with naive and primed pluripotency during
EPI development is shown in Extended Data Fig. 6a. The major 
changes occur during the ICM to pre-EPI and the pre-EPI to postE- 
EPI transitions with relatively large numbers of differentially expressed 
genes, whereas post-EPI stably maintains its properties (Extended 
Data Fig. 6b-d, Supplementary Table 2). Consequently, we found that 
post-EPI acquire genes associated, most prominently, with ‘neuron
differentiation/development’, whereas they downregulate genes
associated with ‘regulation of cell proliferation’ and ‘mitochondrion/
oxidative reduction’ (Extended Data Fig. 6c, Supplementary Table 2).
We confirmed robust expression of SOX11, a marker for neural tubes 
in mice’, in E14 and E16 post-EPI (Extended Data Fig. 6e). The genes 
upregulated in Gast1 or Gast2a exhibited a marked enrichment for 
‘pattern specification process/embryonic morphogenesis’ (Extended 
Data Fig. 6c, Supplementary Table 2). 

To identify genes characteristic of the transition of EPI, we performed 
PCA of the EPI lineage, and determined the 776 genes with significantly 
positive or negative scores of the principal component 1/2 (PC1/2) 
loading (cyEPI ontogenic genes) (Fig. 4a, b, Supplementary Table 2). 


Figure 3 | Classification of key cell types by SC3-seq. Unsupervised
hierarchical clustering (UHC) with all expressed genes (390 cells,
18,353 genes) and a heat map of the levels of selected marker genes.
Colour bars under the dendrogram indicate embryonic days (top)
and cell types (bottom), respectively. G2a/G2b, Gast2a/Gast2b;
pa, parietal; pre, pre-implantation; post, post-implantation;
TE, trophectoderm. Other abbreviations are as indicated in Fig. 2.
The colour coding is as indicated.

Figure 4 | Progressive transitions of the properties of cyEPI. a, PCA
of the EPI lineage by all expressed genes among these groups (213 cells,
17,193 genes). b, Scatter plot of the normalized loading scores of PCA in
a. Orange dots (776 genes: cyEPI ontogenic genes (Supplementary
Table 2)) indicate genes that contributed highly to the PC1 and PC2 axes:
more than 3 s.d. radius of PC1 and PC2 (rPC12>SD3). Key genes are
annotated. c, Heat map of the expression of cyEPI ontogenic genes. The
genes were ordered by UHC, and eight clusters were defined according to
the UHC dendrogram (left). Representative genes and key Gene Ontology
enrichments are shown (Supplementary Table 2). RPM, reads per million
mapped reads.


The UHC classified them into clusters exhibiting characteristic 
expression in the seven cell types (Fig. 4c). Clusters 1 and 2 were 
expressed in pre-implantation cells with cluster 1 being upregulated 
in Gast2a/2b, and were enriched with genes for cell morphology or 
extracellular matrix, and for ‘transcription factor activity’, including
naive pluripotency genes (KLF5, TBX3, ESRRB, TFCP2L1, and SOX15)
(Fig. 4c). Clusters 3 to 5 exhibited expression linking pre- and post-
EPI, and their genes were linked with ‘stem cell maintenance’, including
KLF4, PRDM14, NANOG, and SOX2 (Fig. 4c). Clusters 6 and 7 showed 
expression mainly in post-implantation cells with cluster 6 being down- 
regulated in Gast2b, and exhibited a striking enrichment for ‘neuron 
differentiation’ (Fig. 4c). Genes expressed in postE-EPI were typically 
expressed in Gast1 (Fig. 4c). Cluster 8 was typically upregulated in Gast 
cells and enriched with genes for ‘gastrulation’ (Fig. 4c). Thus, cyEPI 
ontogenic genes distinguish the spectrum of pluripotency during EPI 
development. 
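
The selection rule stated in the Fig. 4b legend (normalized PC1/PC2 loadings lying more than 3 s.d. from the origin) can be illustrated with a minimal sketch. It assumes that "normalized" means a per-axis z-score and that the radius is the Euclidean norm of the two z-scores; the function name, the toy loadings and the gene labels used as outliers are invented for illustration and are not taken from the study data.

```python
import math
from statistics import mean, pstdev


def select_by_loading_radius(loadings, threshold_sd=3.0):
    """Return genes whose normalized (PC1, PC2) loading lies beyond the radius.

    loadings: dict mapping gene name -> (pc1_loading, pc2_loading).
    Normalization here is a z-score per axis, which is an assumption.
    """
    genes = list(loadings)
    pc1 = [loadings[g][0] for g in genes]
    pc2 = [loadings[g][1] for g in genes]
    m1, s1 = mean(pc1), pstdev(pc1)
    m2, s2 = mean(pc2), pstdev(pc2)
    if s1 == 0 or s2 == 0:
        raise ValueError("loadings must vary along both components")
    selected = []
    for gene, a, b in zip(genes, pc1, pc2):
        # Distance from the origin in units of standard deviation.
        radius = math.hypot((a - m1) / s1, (b - m2) / s2)
        if radius > threshold_sd:
            selected.append(gene)
    return selected


# Toy loadings (invented values): 50 background genes near the origin
# plus two genes with a large loading on one component each.
toy = {f"gene{i}": (0.01 * ((i % 5) - 2), 0.01 * ((i % 3) - 1)) for i in range(50)}
toy["T"] = (1.0, 0.0)
toy["KLF5"] = (0.0, -1.0)
ontogenic = select_by_loading_radius(toy)
```

In this toy input only the two large-loading genes pass the threshold; with real loadings the cut-off would be applied to all expressed genes at once.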


Cynomolgus versus mouse EPI development 

To compare cynomolgus and mouse EPI development, we analysed 
single-cell cDNA from EPI (Pou5f1⁺) and visceral endoderm
(Pou5f1⁻ Gata4⁺ at E5.5; Pou5f1⁻ Afp⁺ at E6.5) of E5.5 and E6.5 mouse
embryos by SC3-seq (Extended Data Fig. 7a, b, Supplementary Table 1).
We also analysed the data for EPI, trophectoderm, and primitive
endoderm of E4.5 embryos¹⁹. UHC classified these cells in a stage-
dependent manner and PCA revealed a directional progression of
EPI development, with E6.5 EPI exhibiting a scattered distribution, 
indicating their heterogeneity (Extended Data Fig. 7c, d). Around 
half of E6.5 EPI were highly positive for T (E6.5EPI-Tʰⁱ), and they
were enriched with genes for ‘embryonic morphogenesis/pattern
specification process’ (Extended Data Fig. 7e, f, Supplementary Table 2).
mEPI changed their properties more rapidly and profoundly than
cyEPI (Extended Data Fig. 7g). The genes upregulated during the E4.5-
E5.5 transition were enriched for ‘apoptosis/programmed cell death’
and ‘regulation of transcription’, but unlike in cynomolgus monkeys,
not for ‘neuron differentiation’, whereas those downregulated were
enriched for ‘stem cell maintenance’ (Il6st, Lifr, Nanog, Sox2, Esrrb,
Klf4/5, Tbx3, Nr5a2) (Extended Data Fig. 7g).

PCA revealed that the cynomolgus and mouse cells exhibited a 
separation along the PC1, which represents a major species difference 
(Extended Data Fig. 8a). We identified genes with significantly posi- 
tive or negative PC1 scores, representing the cynomolgus and mouse 
genes, respectively, which were enriched for ‘mitochondrion/oxidative 
reduction’ and ‘neuron projection morphogenesis’ (cynomolgus genes),
and ‘ubiquitin mediated proteolysis/cell cycle/DNA repair’ (mouse 
genes) (Extended Data Fig. 8b, Supplementary Table 2). The enrichment 
of cell-cycle related genes among mouse genes was consistent with 
the rapid proliferation of mouse embryonic cells. Notably, along the 
PC2 and 3 axes, the cynomolgus and mouse cells were plotted in a 
manner reflecting their developmental transitions (Extended Data 
Fig. 8a). We determined the genes with significantly positive or negative 
scores for PC2 and 3 loading, and we subtracted from them the genes 
that had significantly positive or negative scores for PC1 loading. 
A correlation analysis using the resulting gene set (monkey and mouse 
common EPI genes, 473 genes, Extended Data Fig. 8c, Supplementary 
Table 2) revealed that postE-EPI, postL-EPI, and Gast1 exhibited 
the closest correlation with E5.5 EPI, whereas Gast2a/2b showed the 
closest correlation with E6.5EPI-Tʰⁱ (Extended Data Fig. 8d), defining
an approximate developmental coordinate between the two species. 
Thus, cyEPI retain a property of pre-gastrulating mEPI long after 
implantation (~1 week). 
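
The gene-set construction and correlation step described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a Pearson correlation (the text does not state which coefficient was used), and the gene identifiers, cell-type labels and expression values are invented toy data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def common_gene_set(pc23_genes, pc1_genes):
    """Genes with significant PC2/3 loadings, minus those significant on PC1."""
    return set(pc23_genes) - set(pc1_genes)


def closest_reference(query, references, genes):
    """Return the reference profile best correlated with the query over `genes`."""
    ordered = sorted(genes)
    q = [query[g] for g in ordered]
    scores = {
        name: pearson(q, [profile[g] for g in ordered])
        for name, profile in references.items()
    }
    return max(scores, key=scores.get), scores


# Toy example with invented genes and expression values.
common = common_gene_set({"g1", "g2", "g3", "g4"}, {"g4", "g9"})
monkey_cell = {"g1": 1.0, "g2": 2.0, "g3": 3.0, "g4": 100.0}
mouse_refs = {
    "E5.5_EPI": {"g1": 2.0, "g2": 4.0, "g3": 6.0, "g4": 0.0},
    "E6.5_EPI_Thi": {"g1": 3.0, "g2": 2.0, "g3": 1.0, "g4": 0.0},
}
best, scores = closest_reference(monkey_cell, mouse_refs, common)
```

Removing the PC1-associated gene (`g4` in the toy data) before correlating is what keeps the species-specific signal from dominating the comparison.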

To gain insight into a mechanism for the acquisition of the ‘neuron
differentiation’ property in monkey post-EPI, we performed the
KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway 
analysis for genes upregulated during the pre-EPI to postE-EPI 
transition, which revealed an enrichment of the NOTCH signalling 
pathway (Extended Data Fig. 8e-g). Constitutive activation of the 
NOTCH signalling in m/hPSCs or mouse embryos predisposes them 
for neuronal differentiation, with downregulation of the NODAL 
signalling”*’”. Consistently, NODAL was downregulated in post-EPI 
(Extended Data Fig. 8f). Contrastingly, the NOTCH signalling pathway 
was not upregulated in E5.5 mEPI, and Nodal, which prevents 
precocious neural differentiation”®, was strongly expressed in E5.5 
mEPI (Extended Data Fig. 8e, f). Thus, the differential NOTCH 
signalling may create a difference between cynomolgus and mouse 
post-implantation EPI. 


Developmental coordinate of pluripotency 

We next sought to compare PSCs in vitro with the EPI lineage, 
and performed SC3-seq for cyESCs”? (Extended Data Fig. 9a-c, 
Supplementary Table 1). cyESCs were clustered together with postE- 
EPI, postL-EPI, and Gast1, but were highly distinct from ICM and pre- 
EPI; note that one cyESC was clustered within postL-EPI (Extended 
Data Fig. 9d). Consistently, PCA plotted cyESCs closest to postL-EPI 
(Fig. 5a). When compared with pre-EPI, cyESCs up- and downreg- 
ulated as many as 520 and 394 genes, respectively, whereas when 
compared with postL-EPI, cyESCs up- and downregulated only 170 
and 26 genes, respectively (Fig. 5b, Supplementary Table 2). The genes 
upregulated in cyESCs against pre-EPI were enriched with those for 
‘neuron differentiation’, and less significantly those for cell cycles
(Extended Data Fig. 9e, f). The genes upregulated in cyESCs against
postL-EPI were enriched with those for ‘actin cytoskeleton’, suggesting
an adaptation in culture (Fig. 5b, Supplementary Table 2). Thus, cyESCs
bear properties corresponding to postL-EPI, which co-exists with
gastrulating cells and is a precursor for gastrulation.

Contrastingly, mESCs with a ‘ground state’ of pluripotency*?*!
were similar to E4.5 mEPI, whereas epiblast-like cells induced from
mESCs* were highly similar to E5.5 mEPI (Extended Data Fig. 10a-e);
the number of differentially expressed genes between epiblast-like
cells and E5.5 mEPI was much smaller than that between mESCs
and E4.5 mEPI (Extended Data Fig. 10f, Supplementary Table 2).
Cross-species comparison using the monkey and mouse common
EPI genes revealed that cyESCs were correlated with E5.5 mEPI and
epiblast-like cells, and reciprocally, epiblast-like cells were correlated
with postL-EPI and cyESCs (Extended Data Fig. 10g, h). Interestingly,
mESCs were closer to postE-EPI than to pre-EPI (Extended Data
Fig. 10g, h).

We next examined the relationship between hiPSCs¹⁹ (Extended
Data Fig. 9c) and the cyEPI lineage. As there were still considerable
species differences between humans and cynomolgus monkeys, we
used the cyEPI ontogenic genes. Remarkably, hiPSCs exhibited a profile
highly similar to that of cyESCs, and accordingly, to that of postL-EPI
(Fig. 5c, Extended Data Fig. 9g, h). We confirmed that human pre-
implantation EPI (and also marmoset ICM (EPI and hypoblast))²²⁻²⁴
were similar to cynomolgus pre-EPI, but were distinct from post-EPI
(Extended Data Fig. 9g, h), and interestingly, human pre-implantation
EPI cultured for the establishment of hESCs acquired a similarity to
monkey post-EPI at passage 0 (Extended Data Fig. 9g, h). We examined
the properties of hPSCs and cyPSCs that have been previously reported
to show naive pluripotency³³⁻³⁸. Notably, hESCs reported in refs 35-37
exhibited the closest correlation with pre-EPI, whereas hESCs in
ref. 34 and cyESCs in ref. 38 remained essentially unchanged from their
original states and were similar to postL-EPI, and hESCs reported in
ref. 33 showed the closest correlation with Gast1, reflecting their
expression of genes for gastrulation (Fig. 5c, Extended Data Fig. 9g).

Figure 5 | Correlations among cells during cyEPI development and
hPSCs/cyPSCs. a, PCA of monkey embryonic cells and cyESCs by all
expressed genes among these groups (244 cells, 17,450 genes). The colour
coding is as indicated. b, Differentially expressed genes between cells
during the cyEPI development and cyESCs. Top, orange and blue bars
indicate the numbers of up- and downregulated genes, respectively, in the
pair-wise comparisons indicated. Bottom, representative Gene Ontology
terms for genes upregulated in cyESC against postL-EPI (Supplementary
Table 2). c, Heat map of the correlation coefficients among the indicated
cells including those reported by others**-** based on the cyEPI ontogenic
gene levels. The genes in common for all platforms were used (628 out of
776 genes). N, ‘naive’; C, conventional. d, A model for a developmental
coordinate of the spectrum of human, monkey and mouse pluripotency.
Black arrows, direction of differentiation/derivation; red arrows,
homologous relationships.


Discussion 

Through the systematic use of SC3-seq, we have established a transcrip- 
tional foundation for the origin and development of cyEPI, defining 
a developmental coordinate of pluripotency among mice, monkeys, 
and humans (Fig. 5d). After implantation, while generating gastru- 
lating cells, cyEPI maintains its transcriptional property stably for a 
prolonged period (~1 week), suggesting that cyEPI could be a source 
for the cells of the same lineages for an extended period. The finding 
that cy/hPSCs bear transcriptional properties similar to postL-EPI and, 
to a slightly lesser extent, to postE-EPI, is consistent with the fact that 
hPSCs are induced into cells in the three germ layers and primordial 
germ cell-like cells*?“°, although cy/hPSCs lack naive pluripotency and 
show line-dependent differentiation biases”*!. cyEPI ontogenic genes 
(Fig. 4b, c) were highly instructive for defining the properties of hPSCs 
(Fig. 5c); considering the controversial state of naive hPSCs'° and a 
higher similarity of mESCs to postE-EPI than to pre-EPI (Extended 
Data Fig. 10g, h), exploring a condition that provides hPSCs with 
a property more similar to postE-EPI might be one strategy for 
improving the potentials of hPSCs. It should also be interesting to 
compare monkey post-EPI with the human epiblast after ‘implantation 
culture’ of human blastocysts**’. The future challenges will include 
exploring the mechanism for monkey lineage specification as well as 
for the maturation of cyEPI, and performing more comprehensive 
analysis for monkey gastrulation. Such investigation will lead to a better 
strategy for controlling the properties of hPSCs and for generating cells
of interest from hPSCs. 
METHODS


Data reporting. No statistical methods were used to predetermine sample size. The 
experiments were not randomized. The investigators were not blinded to allocation 
during experiments and outcome assessment. 

Animals. Experimental procedures using cynomolgus monkeys were approved 
by the Animal Care and Use Committee of Shiga University of Medical Science. 
The procedures in cynomolgus monkeys for oocyte collection, intra-cytoplasmic 
sperm injection, pre-implantation embryo culture, and transfer of pre-implantation 
embryos into foster mothers were performed as described previously with minor 
modifications!”. For super-ovulation, ovarian stimulation with follicle-stimulating 
hormone (Gonapure, ASKA) was performed by embedding an implantable 
and programmable micro-infusion device (iPRECIO; Primetech Corporation)
subcutaneously. The day when the intra-cytoplasmic sperm injection was 
performed was designated as embryonic day (E) 0. Embryos with more than 
16 cells without blastocoel cavities, those with blastocoel cavities, and those with 
cells outside the zona pellucida were classified as morula, blastocysts, and hatched, 
respectively. The progression of the pre-implantation development of cynomolgus 
monkeys was highly similar to that of rhesus monkeys", but was somewhat slower 
than that of humans‘. For embryo transfer, 4 to 5 two-cell to blastocyst-stage 
embryos were selected and transferred into an appropriate recipient female. For 
the detection of pregnancy of early post-implantation embryos, implanted embryos 
were monitored by ultrasound scanning around E13 and the implanted uterus was 
surgically removed and bisected for the isolation of embryos. 

Experimental procedures using mice were performed under the ethical
guidelines of Kyoto University. For isolating mouse embryos, C57BL/6 mice were
mated, and noon of the day when a copulation plug was identified was designated 
as E0.5. We did not determine the sex of embryos. 
Immunofluorescence analysis. For monkey pre-implantation embryos, whole 
embryos were fixed in 4% paraformaldehyde in PBS (N26126-25; Nacalai Tesque) 
for 20 min at room temperature, washed in 2% BSA/PBS and incubated in the 
permeabilization solution (0.5% Triton X (T9284; Sigma-Aldrich)/1.0% BSA 
(A4503; Sigma-Aldrich)/PBS (20012-027; Thermo Fisher Scientific)) for 20 min 
at room temperature. After washing twice with 2% BSA/PBS, the embryos were 
incubated with primary antibodies in 2% BSA/PBS overnight at 4°C, washed 
three times with 2% BSA/PBS, incubated with secondary antibodies and 
4′,6-diamidino-2-phenylindole (DAPI) in 2% BSA/PBS for 1 h at room temperature,
washed three times with 2% BSA/PBS, and mounted in VECTASHIELD mounting 
medium (H-1000; Vector Laboratories). Image data were captured and processed 
by a confocal microscope (Olympus FV1000 or Zeiss LSM780). 

For monkey post-implantation embryos, implantation sites were dissected out
from the uterus and fixed in 10% formalin (37152-51; Nacalai Tesque) overnight
at 4°C. The samples were embedded in paraffin and sectioned at a thickness of
2-4 μm. Each slide was treated with HistVT ONE (06380-05; Nacalai Tesque)
according to the manufacturer's instructions and incubated in blocking solution 
(2% BSA/PBS). For primary antibody reaction, sections were incubated in the 
blocking solution with primary antibodies at 4°C overnight. After washing six 
times with PBS, sections were incubated in the blocking solution with secondary 
antibodies and DAPI for 1 h at room temperature, washed four times with PBS-T
(0.05% Tween 20 (P9416; Sigma-Aldrich) in PBS), washed twice with PBS, and 
mounted in VECTASHIELD mounting medium (H-1000; Vector Laboratories). 
Image data were captured and processed by a confocal microscope (Olympus 
FV1000 or Zeiss LSM780). After acquisition of the immunofluorescence image, 
sections were re-stained with haematoxylin and eosin for histological analysis. For 
Fig. 2c, d, Extended Data Fig. 2c, several images were acquired for one large area, 
and merged using the Photomerge function of Photoshop CC (Adobe Systems). 
All the antibodies used in this study are listed in Supplementary Table 1 along with 
the information on dilution ratios. 
Cell culture. CMK6 and CMK9 were gifts from H. Suemori”. For cultivation 
on feeders, they were cultured with conventional hESC medium (DMEM/F12 
(D6421; Sigma-Aldrich) supplemented with 20% (vol/vol) of KSR (10828-028; 
Thermo Fisher Scientific), 1 mM of sodium pyruvate (11360-070; Thermo Fisher 
Scientific), 2 mM of GlutaMax (35050-061; Thermo Fisher Scientific), 0.1 mM
of non-essential amino acids (11143-050; Thermo Fisher Scientific), 0.1 mM of
2-mercaptoethanol (M3148; Sigma-Aldrich), 1,000 U ml⁻¹ of ESGRO mouse LIF
(ESG1107; Millipore), and 4 ng ml⁻¹ of recombinant human bFGF (060-04543;
Wako Pure Chemical Industries)) on mouse embryonic feeders (MEFs). For 
feeder-free cultivation, cyESCs were cultured under the same condition as hiPSCs, 
as described previously**®. The cultivation of mESCs and the induction of day 2 
epiblast-like cells were performed as described previously”. All of the cell lines 
used in this study have been tested for mycoplasma contamination by MycoAlert 
(LT07-118; Lonza Japan), according to the manufacturer’s instructions. 
Isolation of single cells for single-cell cDNA preparation. For monkey pre-
implantation embryos, a whole embryo was incubated with 0.25% trypsin/PBS
(T4799; Sigma-Aldrich) for around 10 min at 37°C, then dissociated into single
cells by repeated pipetting, and dispersed in 0.1 mg ml⁻¹ of PVA/PBS (P8136;
Sigma-Aldrich) for preparation of single-cell cDNAs.

For monkey post-implantation embryos, the implantation site was dissected 
out from the uterus and the embryonic fragment containing the epiblast (EPI), 
amnion, hypoblast, and yolk-sac endoderm was isolated manually. The fragment 
was incubated with 0.25% trypsin/PBS for around 10 min at 37°C, then dissociated 
into single cells by repeated pipetting, and dispersed in 0.1 mg ml⁻¹ of PVA/PBS.

For mouse E5.5 and E6.5 embryos, the embryos were dissected out from decidua 
and the extra-embryonic ectoderm was removed manually. The EPI/visceral 
endoderm were incubated in 0.25% Pancreatin (P3292; Sigma-Aldrich)/0.5% 
Trypsin/Polyvinylpyrrolidone (PVP40; Sigma-Aldrich), and EPI and visceral 
endoderm were separated by mild pipetting. For E5.5 embryos, EPI and visceral 
endoderm were incubated with 0.25% trypsin/PBS separately for around 10 min
at 37°C. For E6.5 EPI, their proximal parts were dissected out manually, and the 
proximal EPI and whole visceral endoderm were incubated with 0.25% trypsin/ 
PBS (Sigma-Aldrich) separately for around 10 min at 37°C. The proximal EPI and 
visceral endoderm were dissociated into single cells by repeated pipetting, and 
dispersed in 0.1 mg ml⁻¹ of PVA/PBS (Sigma-Aldrich).

For cyESCs, cells were first detached as clumps with CTK solution (0.25% of
trypsin (15090-046; Thermo Fisher Scientific), 0.1 mg ml⁻¹ of collagenase IV
(17104-019; Thermo Fisher Scientific), and 1 mM of CaCl₂ (06729-55; Nacalai
Tesque)), incubated in TrypLE Select (12563029; Thermo Fisher Scientific) for
around 10 min at 37°C, and dispersed into single cells in 1% (vol/vol) KSR/
PBS containing 10 μM of the ROCK inhibitor Y-27632 (257-00511; Wako Pure
Chemical Industries). Cells under feeder-free condition were directly incubated
in TrypLE Select for around 5 min at 37°C, and dispersed into single cells in 1%
(vol/vol) KSR/PBS containing 10 μM of the ROCK inhibitor Y-27632.

For mESCs and epiblast-like cells, cells were incubated in TrypLE Select for 
around 5 min at 37°C, and dispersed into single cells in 1% (vol/vol) KSR/PBS. 
Single-cell cDNA preparation and transcriptome analysis by SC3-seq. cDNA
synthesis and amplification from isolated single cells were performed essentially as 
described previously!?*”“*. Two types of spike-in RNAs—that is, the four Bacillus 
subtilis mRNAs, lys, dap, phe and thr, used in ref. 47 and 48, and the mRNAs 
developed by the External RNA Controls Consortium (ERCC; Life Technologies 
(4456740))—were used as described in ref. 19. 

Before the construction of the SC3-seq library, the quality of the amplified
cDNAs was evaluated by examining the Ct values of the qPCR of several endoge-
nous genes, and by examining the distribution of the lengths of cDNA fragments
using a LabChip GX (CLS760672; Perkin Elmer) or Bioanalyzer 2100 (5067-4626; 
Agilent Technologies) system. qPCR was performed using Power SYBR Green 
PCR Master Mix (4367659; Life Technologies) with a CFX384 real-time qPCR 
system (Bio-Rad, Hercules, CA) according to the manufacturer’s instructions. The 
primer sequences are listed in Supplementary Table 1. Most of the primer sets were 
designed using Primer-Blast (NCBI) within a distance of 500 base pairs (bp) from 
the transcription termination sites (TTSs). 

SC3-seq libraries of quality checked cDNAs were constructed as described 

previously’’. The quality and quantity of the constructed libraries were evaluated 
by using a LabChip GX or Bioanalyzer 2100 system, a Qubit dsDNA HS assay 
kit (Q32851; Life Technologies), and a SOLiD Library TaqMan Quantitation kit 
(4449639; Life Technologies). The clonal amplification of the libraries on beads by 
emulsion PCR (emPCR) was performed using a SOLiD EZ Bead System (4449639; 
Life Technologies) at the E120 scale according to the manufacturer’s instruction. 
The resulting bead libraries were loaded onto flowchips and sequenced for 50 bp
and 5 bp barcode plus Exact Call Chemistry (ECC) on a SOLiD 5500XL system
(4449639; Life Technologies).
Mapping reads of RNA-seq and conversion to gene expression levels. The 
genome sequence (GRCm38/mm10 for mice, GRCh37/hg19 for humans, and 
MacFas5.0 for cynomolgus monkeys (Macaca fascicularis)) and the transcript 
annotation (GRCm38 for mice, GRCh37 for humans, and MacFas5.0 for 
cynomolgus monkeys) were obtained from the NCBI ftp site at
ftp://ftp-trace.ncbi.nlm.nih.gov/genomes/Macaca_fascicularis. SC3-seq reads only the 3’ end of
transcripts, so that the expression levels were calculated as genes (Entrez genes) 
but not mRNAs. Read trimming, mapping and estimation of expression levels were 
performed as described previously!?’. 

For mapping of the full-length RNA-seq data obtained externally, reads of data
from refs 22-24, 33-38 and 49 were mapped onto the human (hg19), cynomolgus (MacFas5.0),
marmoset (calJac3), or mouse (mm10) genome, using TopHat v2.0.11 (ref. 50),
respectively. Mapped data were converted into expression levels using cufflinks
v.2.2.0 (ref. 51) with the ‘--compatible-hits-norm --library-type fr-unstranded
--max-mle-iterations 50000’ options.

Data analysis of the SC3-seq. Data analysis was performed using R software 
version 3.1.1 with the gplots (ver. 2.16.0), qvalue (ver. 1.40.0), rgl (ver. 0.95.1201), 
vioplot (ver. 0.2), and genefilter (ver. 1.48.1) packages, and EXCEL (Microsoft), as 
described previously’. All the analyses of expression data were performed using 
log₂(RPM+1) values. We defined ‘all expressed genes’ as genes whose log₂(RPM+1)
values were >4 (greater than ~10-20 copies per cell, which is a lower limit
for reliable and reproducible detection by our single-cell cDNA amplification 
method!*”) in at least one sample. Unsupervised hierarchical clustering (UHC) 
was performed using the hclust function with Euclidean distances and Ward’s
method (ward.D2). The principal component analysis (PCA) was performed using 
the prcomp function without scaling. To identify differentially expressed genes 
(DEGs) among multi-groups, the kruskal.test function for the Kruskal-Wallis test 
and the q value function were used for the calculation of the P value and false 
discovery ratio (FDR)*’, respectively. The DEGs were defined as the genes 
exhibiting more than fourfold changes between the samples (FDR < 0.01), and 
the mean of the expression level of the group was log₂(RPM+1) > 4. For the
Gene Ontology and Kyoto Encyclopedia of Genes and Genomes** (KEGG) 
analyses using the DAVID web tool”, since the annotation of Macaca fascicularis 
genes was relatively incomplete, human annotation corresponding to that of 
cynomolgus monkeys was used. For this purpose, a one-to-one correspondence 
table of genes was made by genomic coordinate comparison using the LiftOver 
tool, as described previously’ (Supplementary Table 2). 
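
The thresholds stated in this paragraph can be made concrete with a short sketch. It is an illustration under stated assumptions rather than the pipeline used in the study (which was run in R): the function names are hypothetical, the fold change is taken as the difference between group means on the log₂ scale, the mean-expression criterion is applied to the higher of the two groups, and the FDR is supplied by the caller instead of being computed with a Kruskal-Wallis test and q-value estimation.

```python
import math


def log2_rpm(counts):
    """Convert one cell's read counts (gene -> reads) to log2(RPM + 1)."""
    total = sum(counts.values())
    return {g: math.log2(c * 1e6 / total + 1) for g, c in counts.items()}


def expressed_genes(samples, threshold=4.0):
    """Genes whose log2(RPM + 1) exceeds the threshold in at least one sample."""
    return {g for sample in samples for g, v in sample.items() if v > threshold}


def passes_deg_filter(group_a, group_b, fdr,
                      fold=4.0, min_mean=4.0, max_fdr=0.01):
    """Apply the fold-change, FDR and mean-expression cut-offs to one gene.

    group_a, group_b: log2(RPM + 1) values of the gene in two groups of cells.
    fdr: false discovery rate for this gene, computed elsewhere.
    """
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    # A fourfold change corresponds to a difference of 2 on the log2 scale.
    changed = abs(mean_a - mean_b) > math.log2(fold)
    return fdr < max_fdr and changed and max(mean_a, mean_b) > min_mean


# Toy counts (invented): gene C has 10 RPM, below the detection cut-off.
cell = log2_rpm({"A": 500000, "B": 499990, "C": 10})
detected = expressed_genes([cell])
```

For example, `passes_deg_filter([8, 9, 10], [5, 6, 7], fdr=0.001)` is true because the group means differ by 3 log₂ units, whereas the same values with `fdr=0.05` fail the FDR cut-off.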

Comparison of gene expression between cynomolgus monkeys and humans, 
and between cynomolgus monkeys and mice. For comparison of the gene 
expression between cynomolgus monkeys and humans, common genes listed in the 
cynomolgus monkeys—humans one-to-one annotation table were used. Specifically, 
there are 28,551 genes in MacFas5.0 and 22,577 genes in GRCh37, with 17,542 
genes in common between the two (Supplementary Table 2). For a comparison 
between cynomolgus monkeys and mice, first, a humans—mice annotation list was 
generated, and then a cynomolgus monkeys—humans list and mice-humans list 
were combined using human gene identifiers. There are 24,216 genes in GRCm38 
and 15,933 genes in the mice-humans combined list, and consequently, 15,220 
genes in the cynomolgus monkeys—humans-mice gene list (Supplementary Table 2). 
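
Combining the two one-to-one tables through human gene identifiers amounts to a join on the human column. The sketch below assumes each input table is already one-to-one; the function name and all identifiers are invented placeholders, not entries from Supplementary Table 2.

```python
def combine_via_human(cyno_to_human, mouse_to_human):
    """Join two one-to-one tables on the human gene identifier.

    cyno_to_human: dict mapping cynomolgus gene ID -> human gene ID.
    mouse_to_human: dict mapping mouse gene ID -> human gene ID.
    Returns a dict mapping human gene ID -> (cynomolgus ID, mouse ID).
    """
    human_to_mouse = {human: mouse for mouse, human in mouse_to_human.items()}
    combined = {}
    for cyno, human in cyno_to_human.items():
        mouse = human_to_mouse.get(human)
        if mouse is not None:  # keep only genes present in both tables
            combined[human] = (cyno, mouse)
    return combined


# Toy identifiers (hypothetical).
cyno_human = {"cy_1": "HS_A", "cy_2": "HS_B", "cy_3": "HS_C"}
mouse_human = {"mm_1": "HS_A", "mm_2": "HS_C", "mm_3": "HS_D"}
three_way = combine_via_human(cyno_human, mouse_human)
```

Genes missing from either table drop out of the result, which is why the three-species list is smaller than either pairwise list.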
Analysis of published expression data for human PSCs. Expression levels of 
RNA-seq data were calculated as described above, and those of microarray 
data***> were obtained from a series matrix sheet in the GEO repository (NCBI). 
For data processing, expression levels of RNA-seq data were transformed
into log₂(fragments per kilobase of exon per million mapped sequence reads
(FPKM) + 0.1), and those of microarray data were transformed into log₂(intensity).
For comparison of microarray data to the SC3-seq data, the highest intensity probes 
were used for genes with multiple probes. 
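
The two transformations and the probe-selection rule can be sketched as follows. This is a minimal example for a single sample, assuming that "highest intensity" is judged on the raw intensity within that sample; the function names, probe identifiers and intensities are invented for illustration.

```python
import math


def log2_fpkm(fpkm):
    """Transform an FPKM value as log2(FPKM + 0.1)."""
    return math.log2(fpkm + 0.1)


def collapse_probes(rows):
    """Keep the highest-intensity probe per gene and log2-transform it.

    rows: iterable of (gene, probe_id, intensity) tuples for one sample.
    Returns a dict mapping gene -> (probe_id, log2(intensity)).
    """
    best = {}
    for gene, probe, intensity in rows:
        if gene not in best or intensity > best[gene][1]:
            best[gene] = (probe, intensity)
    return {g: (p, math.log2(v)) for g, (p, v) in best.items()}


# Toy microarray rows (invented probe IDs and intensities).
rows = [
    ("NANOG", "probe_1", 100.0),
    ("NANOG", "probe_2", 400.0),
    ("SOX2", "probe_3", 64.0),
]
collapsed = collapse_probes(rows)
```

The 0.1 pseudocount keeps genes with zero FPKM finite on the log scale (log₂(0.1) is about -3.32).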


Accession numbers. Accession numbers for the data generated in this study and 
for the published data used in this study are as follows. The SC3-seq data in this 
study: GSE74767; those of mouse E4.5 cells and human iPSCs (ref. 19): GSE63266; the
transcriptome data for ref. 22: GSE36552; ref. 23: GSE66507; ref. 33: GSE46872;
ref. 34: E-MTAB-2031; ref. 36: E-MTAB-2857; ref. 37: E-MTAB-4461; ref. 35:
GSE59430; ref. 24: E-MTAB-2958, E-MTAB-2959; ref. 38: GSE69708; and ref. 49:
GSE45916. The samples used are listed in Supplementary Table 1.
Extended Data Figure 1 | Summary of monkey pre-implantation
development. a, A phylogenic tree of primates*°. Cynomolgus, rhesus, 
and Japanese monkeys are members of macaques of Cercopithecidae,
which are classified as old-world monkeys. b, Summary of super- 
ovulation and oocyte collection for this study. c, Summary of monkey pre- 
implantation development. No insemination, embryos without cleavage; 
degenerated, degenerated embryos; arrested, embryos that failed to form 
a blastocoel; blastocyst formation, embryos with blastocoel formation by 
E8. d, Developmental progression of monkey pre-implantation embryos. 
Embryos with more than 16 cells without blastocoel cavities, those with 
blastocoel cavities, and those with cells outside the zona pellucida were 
classified as morula, blastocysts, and hatched, respectively. e, The cell 
numbers (counted by DAPI) of pre-implantation embryos from
E6 to E9. The cells with degenerated nuclei were excluded. The colour
coding is as indicated. f, Scatter plots of the cell numbers that were
positive for each marker (y axis, the colour coding indicated) against
the whole-cell numbers (x axis). Each plot indicates the numbers
in one embryo. The orange and red bars indicate the range of
embryonic days and developmental stages, respectively. g, Expression
of CDX2/OCT4/GATA4 from E6 to E9 (embryos n = 7, 6, 12, 12,
respectively). The numbers of cells positive for each marker are
indicated. h, Expression of GATA6/OCT4 from E6 to E9 (embryos n = 4,
13, 11, 3, respectively). The numbers of cells positive for each marker
are indicated. i, Expression of TFAP2C/OCT4/GATA6 at E8 (embryos
n = 5). ICM is magnified (right). Arrowheads indicate hypoblast.
j, Summary of the expression of key markers in monkey, human and
mouse pre-implantation embryos. Blastocysts of the three species show 
grossly similar morphology, but notably, monkey hypoblast extends 
parietally to cover mural trophectoderm. OCT4 expression appears to 

be equal in EPI and hypoblast of mouse blastocysts”, whereas OCT4 is 
expressed at a higher level in EPI than in hypoblast and trophectoderm in 
human and monkey blastocysts'*?. NANOG and GATA4 exhibit a similar 
expression pattern among the three species*” °*. CDX2 shows a similar 
expression in human and monkeys, but an earlier expression in morula in 
mice!**°, GATA6 exhibits the most variable expression pattern among the 
three species: it is expressed only in hypoblast in humans and mice*”»”, but 
is uniformly expressed in hypoblast and trophectoderm in monkeys. Scale
bars, 100 μm.
Extended Data Figure 2 | Monkey early post-implantation 
development. a, Ultrasound diagnosis of the recipient uterus for the 
implantation of transplanted embryos at E14 and E16. Dashed circles 
indicate the uterus and arrowheads indicate the chorionic cavity. 

Scale bars, 10 mm. b, Implantation (white arrowheads) and pseudo- 
implantation (black) sites on the recipient endometrium. The image 

at right is a higher magnification of the area boxed on the left. The 
implantation site was identified by maternal blood in the trophoblastic 
lacunae. The pseudo-implantation site was endometrium that had reacted to 
implantation on the overlying endometrium. Scale bar, 2 mm. c, Scheme 
of monkey early post-implantation development. AM, amnion; 

CS, connective stalk; CT, cytotrophoblast; EXCM, exocoelomic membrane; 
EXMC, extra-embryonic mesenchyme; Gast, gastrulating cells; 


SYS, secondary yolk sac; VE, visceral endoderm; YE, yolk-sac endoderm; 
TE, trophectoderm. d, Lower magnification images of Fig. 2b showing 
whole implantation sites at E14 (left) and E16 (right). PYS/CC, primary 
yolk sac/chorionic cavity. Scale bars, 500 μm (left) and 1.0 mm (right). 

e, Expression of OCT4/NANOG/GATA4 and OCT4/T/GATA6 in post- 
implantation embryos at E16 (embryos n = 2). Gastrulating cells positive 
for T and OCT4 migrated along visceral endoderm. Some cells (yellow 
arrowhead) showed ingression into visceral endoderm (white arrowhead). 
Scale bars, 100 μm. f, Expression of OCT4/GATA4 (left) and OCT4/ 
GATA6 (right) in embryos at E14 (embryos n = 2). Arrowheads indicate 
extra-embryonic mesenchyme. Scale bars, 100 μm. 
Extended Data Figure 3 | Expression of key markers in single-cell 
cDNAs generated from monkey pre- and post-implantation embryos. 
a, Summary of the SC3-seq samples. The numbers of embryos analysed 
(all of the embryos exhibited normal morphology), of synthesized 
cDNAs with appropriate quality, and of the cells analysed by SC3-seq are 
listed. b, c, PCR analysis of the expression of key markers in single-cell 


cDNAs generated from pre-implantation (E6, E7, E8, E9) (b) and post- 
implantation (E13, E14, E16, E17) (c) embryos. The ΔCt values from the 
average Ct values of GAPDH and PPIA are shown as heat maps and are 
used for clustering. The identities of the embryos and the samples used 
for the SC3-seq analyses (annotations are based on Fig. 3a) are indicated. 
The colour coding is as indicated. 
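
The ΔCt normalization mentioned in this caption can be illustrated with a minimal Python sketch. It assumes the sign convention "mean reference Ct minus target Ct", which is consistent with the negative scale shown on the heat maps but is not stated explicitly in the caption; the Ct values and the function name `delta_ct` are hypothetical and serve only as an example.

```python
from statistics import mean


def delta_ct(ct_target, ct_references):
    """Return mean(reference Ct) - target Ct.

    Assumed sign convention: a more negative value means lower expression
    of the target relative to the reference genes.
    """
    return mean(ct_references) - ct_target


# Hypothetical Ct values for one single-cell cDNA (illustrative, not from the paper).
sample = {"GAPDH": 18.0, "PPIA": 20.0, "POU5F1": 23.0, "GATA4": 29.0}
reference_genes = ("GAPDH", "PPIA")

reference_cts = [sample[g] for g in reference_genes]
deltas = {
    gene: delta_ct(ct, reference_cts)
    for gene, ct in sample.items()
    if gene not in reference_genes
}
print(deltas)
```

A matrix of such values, one row per marker and one column per cell, is the kind of input that a heat map with hierarchical clustering would take.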
Extended Data Figure 4 | Comparison of the performance of the SC3- 
seq with that of other RNA-seq and single-cell RNA-seq methods. 

a, Distributions of the expression levels of all annotated genes by the 
SC3-seq (diluted total RNA”, single cells; in this study) (left) and other 
methods (refs 24 and 49) (right) represented by violin plots. Medians are shown 

by white circles. For the SC3-seq, the transcript-level (log2(RPM+1)) 
distributions of 100 ng, 100 pg, and 10 pg of total RNA or single cells from 
2i+L mESCs (cultured in N2B27 medium supplemented with a cytokine 
leukaemia inhibitory factor (LIF) and two kinase inhibitors (PD0325901 
and CHIR99021))** as starting materials are shown. For refs 24 and 49, 
the transcript levels are shown as log2(FPKM+0.1). Transcripts from 

20 cells of 2i+L mESCs are amplified in ref. 24, and in ref. 49, 500 ng 

of polyA RNA from serum/LIF mESC/iPSCs are used for a standard 
RNA-seq procedure. b, Distributions of the expression levels of genes 


expressed at significant levels (log2(RPM+1) > 4 in at least one 100 ng 
RNA sample) among the same sample set in a. c, Comparisons of the 
distributions of the expression levels among corresponding samples in 
different data sets’? **. The geneset used in the first row consisted of all 
annotated genes. In the second, third, and fourth rows, the genesets used 
were genes expressed at significant levels as in b in at least one cell in the 
monkey pre-EPI group, in marmoset ICM samples, and in human 

EPI samples” respectively. d, Scatter-plot analysis of the correlation 
between the expression level and the transcript length detected by the 
SC3-seq (2iLESC_MS68T82)!’, (GSM1119616)*”, and (ERR637931)"4. 
Note that the method in ref. 24 tends to yield lower estimations of the 
levels of longer transcripts compared to the SC3-seq 

and standard RNA-seq. 
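
The log2(RPM+1) scale used throughout these captions can be sketched as follows. This is a minimal illustration that assumes RPM means reads per million mapped reads without any transcript-length normalization; the gene names, counts and the function name `log2_rpm_plus1` are hypothetical.

```python
import math


def log2_rpm_plus1(read_counts):
    """Convert raw per-gene read counts to log2(RPM + 1).

    RPM is taken here as reads per million mapped reads in the sample
    (an assumption; no length normalization is applied).
    """
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("sample has no mapped reads")
    return {
        gene: math.log2(count * 1_000_000 / total + 1)
        for gene, count in read_counts.items()
    }


# Hypothetical counts chosen so that the sample total is exactly one million reads.
counts = {"geneA": 0, "geneB": 1, "geneC": 999_999}
levels = log2_rpm_plus1(counts)
print(levels)
```

On this scale an undetected gene sits at 0, and the threshold log2(RPM+1) > 4 used in panels b and c corresponds to more than 15 reads per million.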


© 2016 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 




Extended Data Figure 5 | Characterization of extra-embryonic 
mesenchyme. a, PCA of all cells by all expressed genes (390 cells, 18,353 
genes). The colour coding is as indicated. b, Heat map of correlation 
coefficients among cells during monkey development. The values were 
calculated using the averaged expression levels of 6,167 DEGs in each cell 
type (the genes exhibiting more than fourfold changes among the groups 
(FDR < 0.01), and the mean of the expression level of at least one group 
was log2(RPM+1) > 4). c, PCA of cells from pre-implantation embryos 
by all expressed genes among these groups (193 cells, 15,187 genes). 

d, Scatter-plot comparison of the expression of key genes with that of 
POU5F1 in post-implantation cells. e, Expression of COL6A1/FOXA1 in 
embryos at E14 and E16 (embryos n = 2, 2, respectively). Scale bars, 

100 μm. f, Venn diagram showing the overlap among the genes expressed 
at higher levels in EXMC (>4-fold) compared to postE-EPI, Gast1, 


or post-implantation parietal trophectoderm at E13 and E14. g, Heat map 
of the expression of the 228 genes identified, as in f, among monkey pre- 
and post-implantation cell types. h, Enrichment of Gene Ontology terms 
in the 228 genes identified, as in f. i, Venn diagram showing the overlap 
between the genes expressed at higher levels in hypoblast (>4-fold) 
mice (top) or for the differentiation of hypoblast into VE/YE or extra- 
embryonic mesenchyme in cynomolgus monkeys (bottom). 


[Extended Data Figure 6 image: panels a to e. Panel a, expression (log2(RPM+1)) of selected naive and core pluripotency genes; legend: ICM, Pre-EPI, PostE-EPI, PostL-EPI, Gast1, Gast2a, Gast2b; bar, median. Panels b and c, pair-wise comparisons: ICM > Pre-EPI, Pre-EPI > PostE-EPI, PostE-EPI > PostL-EPI, PostE-EPI > Gast1, PostL-EPI > Gast2a. Panel e, DAPI, OCT4 and SOX11 staining at E14 and E16.]

Extended Data Figure 6 | DEGs during the cyEPI development. 

a, Expression of selected genes during EPI development (black bars, 
median values). b, DEGs during the cyEPI development. Orange and blue 
bars indicate the numbers of up- and downregulated genes, respectively, 
in the pair-wise comparisons indicated. c, Enrichment of Gene Ontology 
terms and representative genes (all genes are shown in Supplementary 
Table 2) in DEGs in the pair-wise comparisons indicated. d, Scatter-plot 
comparison of the averaged gene-expression levels between ICM and 


pre-EPI (top), pre-EPI and postE-EPI (middle), and postE-EPI and 
postL-EPI (bottom). Orange, upregulated; blue, downregulated (>4-fold 
difference (flanking diagonal lines), mean log2(RPM+1) > 4 in one cell 
type, FDR < 0.01). Key genes are annotated and the numbers of DEGs are 
indicated. The correlation coefficient is indicated above the scatter plot. 
e, Expression of SOX11/OCT4 in embryos at E14 (top) and E16 (bottom) 
(embryos n = 2, 2, respectively). Scale bars, 100 μm. 
Values labelled in panel d: correlation coefficients 0.9667 (top), 0.9485 
(middle) and 0.9829 (bottom); DEG counts 304 and 137 genes (top), 258 and 
442 genes (middle), and 24 and 85 genes (bottom). 
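
The DEG criteria stated for panel d (more than fourfold difference, mean log2(RPM+1) > 4 in one cell type, FDR < 0.01) can be sketched in Python as below. The sketch assumes that a fourfold difference is read as a difference of more than 2 on the log2(RPM+1) scale, as the flanking diagonal lines suggest, and that the FDR comes from a separate statistical test that is not reproduced here. The gene names, values and the function name `classify_deg` are hypothetical.

```python
def classify_deg(mean_a, mean_b, fdr,
                 min_log2_diff=2.0, min_level=4.0, max_fdr=0.01):
    """Classify one gene as 'up', 'down' or 'unchanged' from cell type A to B.

    mean_a, mean_b: mean log2(RPM+1) in each cell type.
    fdr: false discovery rate for the comparison (computed elsewhere).
    """
    if fdr >= max_fdr:
        return "unchanged"
    if max(mean_a, mean_b) <= min_level:
        return "unchanged"  # not expressed above threshold in either cell type
    diff = mean_b - mean_a
    if diff > min_log2_diff:
        return "up"
    if diff < -min_log2_diff:
        return "down"
    return "unchanged"


# Hypothetical genes: (mean in A, mean in B, FDR).
genes = {
    "gene1": (2.0, 7.5, 0.001),  # passes all criteria, higher in B
    "gene2": (9.0, 5.0, 0.005),  # passes all criteria, lower in B
    "gene3": (1.0, 3.5, 0.001),  # large change but below the expression threshold
    "gene4": (5.0, 8.0, 0.2),    # large change but FDR too high
}
calls = {name: classify_deg(*values) for name, values in genes.items()}
print(calls)
```

Counting the "up" and "down" calls for each pair of cell types would give numbers of the kind shown as orange and blue bars in panel b.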


[Extended Data Figure 7 image: panels a to g. Panel a table, as far as it can be recovered:

Stage    cDNA    SC3-seq
E4.5*    67      37
E5.5     86      25
E6.5     101     24
Total    254     86

* Nakamura et al. 2015 (NAR).]

Extended Data Figure 7 | SC3-seq analyses of cells in mouse pre- and 
post-implantation embryos. a, Summary of the SC3-seq samples of 
mouse embryonic cells. The numbers of synthesized cDNAs of appropriate 
quality and of the cells analysed by SC3-seq are listed. b, qPCR analysis of 
the expression of key markers in single-cell cDNAs from mouse E5.5 
(pre-gastrulation) and E6.5 (early/mid-streak stage) embryos. The ΔCt 
values from the average Ct values of Gapdh and Arbp are shown as heat 
maps and are used for clustering. For E5.5 embryos, cells were picked 
from EPI and visceral endoderm. For E6.5 embryos, cells were picked 
from proximal EPI and visceral endoderm. The samples used for the 
SC3-seq analyses (the annotations are based on c) are indicated. c, UHC 
of cells from mouse E4.5 (EPI, primitive endoderm, mural trophectoderm 
(mTE), and polar trophectoderm (pTE)”’), E5.5, and E6.5 embryos by all 
expressed genes (log2(RPM+1) > 4 in at least one sample among 86 cells, 
13,761 genes), and heat map of the levels of selected marker genes. Colour 
bars under the dendrogram indicate the cell types. Orange and green 


bars in the E6.5 EPI cluster indicate E6.5EPI-Tˡᵒ (green) and E6.5EPI-Tʰⁱ 
(orange) cells, respectively. See e for details. d, PCA of cells by all expressed 
genes among the indicated cells (67 cells, 13,761 genes). e, Scatter-plot 
comparison of the expression of key genes for pluripotency or primitive 
streak formation against that of T in E6.5 EPI. Cells were classified 

by the levels of T (orange, E6.5EPI-Tʰⁱ; green, E6.5EPI-Tˡᵒ). f, Heat map 
of the levels of genes with correlation or anti-correlation with T in E6.5 
EPI. Genes were selected as follows: log2(RPM+1) > 6 in at 
least one cell, and correlation coefficient with T > 0.6 (102 genes) or < −0.6 
(99 genes). Enrichment of Gene Ontology terms and representative 

genes are indicated. g, DEGs during the mEPI development. Top, Orange 
and blue bars indicate the numbers of up- and downregulated genes, 
respectively, in the pair-wise comparisons indicated. Bottom, Enrichment 
of Gene Ontology terms and representative genes in DEGs 

in the pair-wise comparisons indicated. 
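
The gene selection described for panel f (expression above a threshold in at least one cell, and a correlation coefficient with T above 0.6 or below −0.6) can be sketched in Python as follows. The sketch assumes a Pearson correlation on log2(RPM+1) values, which the caption does not specify; all gene names, expression values and function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def select_by_correlation(expr, anchor="T", min_level=6.0, cutoff=0.6):
    """Split genes into those correlated and anti-correlated with the anchor gene."""
    anchor_levels = expr[anchor]
    correlated, anti_correlated = [], []
    for gene, levels in expr.items():
        if gene == anchor or max(levels) <= min_level:
            continue  # skip the anchor itself and genes never above the threshold
        r = pearson(levels, anchor_levels)
        if r > cutoff:
            correlated.append(gene)
        elif r < -cutoff:
            anti_correlated.append(gene)
    return correlated, anti_correlated


# Hypothetical log2(RPM+1) values across four cells (illustrative only).
expr = {
    "T": [0.0, 2.0, 6.0, 10.0],
    "geneUp": [1.0, 3.0, 7.0, 11.0],    # rises with T
    "geneDown": [9.0, 8.0, 4.0, 1.0],   # falls as T rises
    "geneLow": [0.0, 1.0, 3.0, 5.0],    # correlated but never above the threshold
    "geneFlat": [7.0, 7.0, 7.0, 7.0],   # expressed but constant
}
correlated, anti_correlated = select_by_correlation(expr)
print(correlated, anti_correlated)
```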


[Extended Data Figure 8 image: panels a to g. Panel b gene groups: PC1 > 2 s.d., 446 genes; PC1 < −2 s.d., 338 genes. Panel f genes: DLL1, DLL3, DLL4, JAG1, JAG2, NOTCH1, NOTCH2, NOTCH3, NOTCH4, RBPJ, MAML1, MAML2, MAML3 and NODAL, among others.]

Extended Data Figure 8 | Comparison of monkey and mEPI 
development. a, PCA of cells during monkey (circles, colour-coded as 
in Fig. 3a) and mouse (squares) EPI development. Orthologues among 
humans, cynomolgus monkeys and mice were annotated (15,220 genes), 
and 13,473 genes expressed among these cells (monkey, 213 cells; mouse, 
44 cells) were used for PCA. b, Heat map of the expression of 784 genes 
that contributed highly to the PC1 axis (>2 s.d. of PC1: cyEPI or mEPI 
genes (PCA as in a)). The genes are ordered by UHC, and representative 
cyEPI or mEPI genes and their key Gene Ontology enrichments are 
shown. c, Heat map of the levels of monkey and mouse common EPI genes 
(473 genes) (defined as: radius of PC2 and 3 > 3 s.d. and of PC1: −2 s.d. 
< PC1 < 2 s.d.) during monkey and mEPI development. d, Heat map of 
correlation coefficients among cells during cyEPI and mEPI development. 
The values were calculated using the averaged expression level of monkey 
and mouse common EPI genes (473 genes (c, Supplementary Table 2)). 
e, Signalling pathways enriched in genes upregulated during the pre-EPI 
to postE-EPI transition (top, 258 genes) or during the E4.5-E5.5 mEPI 
transition (bottom, 455 genes) by the KEGG pathway analysis. f, Heat 
map of the expression changes of key genes in the NOTCH pathway 
and NODAL/Nodal upon implantation. g, A proposed pathway 
operating in monkey post-EPI, which acquires a property for ‘neuron 
differentiation’. NICD, NOTCH intracellular domain. 

Panel e table:

Gene set                                    KEGG ID     KEGG pathway                           P-value
Cynomolgus, Pre-EPI < PostE-EPI, 258 genes  hsa04330    Notch signaling pathway                9.89E-03
                                            hsa04510    Focal adhesion                         9.99E-03
                                            hsa05218    Melanoma                               3.89E-02
Mouse, E4.5EPI < E5.5EPI, 455 genes         mmu04115    p53 signaling pathway                  2.78E-03
                                            mmu05210    Colorectal cancer                      8.22E-03
                                            mmu04010    MAPK signaling pathway                 1.98E-02
                                            mmu05217    Basal cell carcinoma                   2.59E-02
                                            mmu04664    Fc epsilon RI signaling pathway        2.63E-02
                                            mmu05014    Amyotrophic lateral sclerosis (ALS)    2.91E-02
                                            mmu00270    Cysteine and methionine metabolism     2.96E-02
                                            mmu05200    Pathways in cancer                     3.28E-02
                                            mmu04310    Wnt signaling pathway                  3.32E-02
                                            mmu05215    Prostate cancer                        3.72E-02
                                            mmu05416    Viral myocarditis                      4.36E-02
                                            mmu04920    Adipocytokine signaling pathway        4.84E-02


[Extended Data Figure 9 image: panels a to h. Panel d, 421 samples, 18,527 genes. Panel e, cyESC > Pre-EPI, 520 genes. Panel f, GO:0008284 (positive regulation of cell proliferation) and GO:0008285 (negative regulation of cell proliferation). Panels g and h, Cy-Hs-Cj common orthologues, 628/776 genes.]

Extended Data Figure 9 | Correlations between hPSCs and cyPSCs 

and cells during cyEPI development. a, Morphology of cyESCs (CMK6) 
cultured with (left) or without (right) feeders. Scale bars, 200 μm. b, qPCR 
analysis of the expression of key markers in single-cell cDNAs generated 
from cyESCs (CMK6 and CMK9) cultured with or without feeders. The 
ΔCt values from the average Ct value of GAPDH and PPIA are shown as 
heat maps. c, Summary of the SC3-seq samples of cyESCs and hiPSCs?°. 
The numbers of synthesized cDNAs of appropriate quality and of the 

cells analysed by SC3-seq are listed. d, UHC with all expressed genes 

(421 cells, 18,527 genes). Note that one cyESC is clustered with postL-EPI. 




e, Enrichment of Gene Ontology terms in genes upregulated in cyESC 
against pre-EPI (520 genes). f, UHC with expression of genes for positive 
(left, 351 genes of Gene Ontology: 0008284) or negative (right, 391 genes 
of Gene Ontology: 0008285) regulation of cell proliferation. g, Heat map 
of the expression of cyEPI ontogenic genes among the indicated cells, 
including those reported by others”? **7?~*8, The genes in common 

for all platforms were used (628/776 genes). N, ‘naive’; C, conventional. 

h, Heat map of the correlation coefficients among cells as in g. Correlation 
coefficients were calculated using the averaged expression levels 

of genes in g. 
Extended Data Figure 10 | Correlations among mPSCs and cells during 
cyEPI and mEPI development. a, Morphology of mESCs with 2i+L 

and day 2 epiblast-like cells (EpiLC) induced from the mESCs™. Scale bars, 
200 μm. b, Summary of the SC3-seq samples of mESCs and epiblast-like 
cells. The numbers of synthesized cDNAs of appropriate quality and of 

the cells analysed by SC3-seq are listed. c, qPCR analysis of the expression 
of key markers in single-cell cDNAs generated from 2i+L mESCs and 
epiblast-like cells. The ΔCt values from the Ct values of Arbp are shown as 
heat maps and are used for clustering. d, UHC of cells from E4.5, E5.5, and 
E6.5 embryos, 2i+L mESCs, and epiblast-like cells with all expressed 
genes (108 cells, 14,628 genes). Colour bars under the dendrogram 
indicate the cell types. e, PCA of mEPI, primitive endoderm/visceral 
endoderm, 2i+L mESCs, and epiblast-like cells by all expressed genes 
among these cells (89 cells, 14,341 genes). f, Scatter-plot comparisons 

of the averaged gene-expression levels between E4.5 EPI and 2i+L 
mESCs (left), and between E5.5 mEPI and epiblast-like cells (right). 

Key genes are annotated and the numbers of DEGs are indicated. The 
correlation coefficient is indicated above the scatter plots. g, Heat map of 
the expression of monkey and mouse common EPI genes (473 genes, as in 
Extended Data Fig. 8c) in cells during cyEPI and mEPI development and 
in cyPSCs and mPSCs. h, Heat map of the correlation coefficients among 
cells as in g. Correlation coefficients were calculated using the averaged 
expression levels of genes in g. 
Tumour hypoxia causes DNA hypermethylation by reducing TET activity 


Bernard Thienpont, Jessica Steinbacher, Hui Zhao, Flora D’Anna, Anna Kuchnio, Athanasios Ploumakis, 
Bart Ghesquière, Laurien Van Dyck, Bram Boeckx, Luc Schoonjans, Els Hermans, Frederic Amant, 
Vessela N. Kristensen, Kian Peng Koh, Massimiliano Mazzone, Mathew L. Coleman, Thomas Carell, Peter Carmeliet & 
Diether Lambrechts 


Hypermethylation of the promoters of tumour suppressor genes represses transcription of these genes, conferring 
growth advantages to cancer cells. How these changes arise is poorly understood. Here we show that the activity of 
oxygen-dependent ten-eleven translocation (TET) enzymes is reduced by tumour hypoxia in human and mouse cells. 
TET enzymes catalyse DNA demethylation through 5-methylcytosine oxidation. This reduction in activity occurs 
independently of hypoxia-associated alterations in TET expression, proliferation, metabolism, hypoxia-inducible factor 
activity or reactive oxygen species, and depends directly on oxygen shortage. Hypoxia-induced loss of TET activity 
increases hypermethylation at gene promoters in vitro. In patients, tumour suppressor gene promoters are markedly more 
methylated in hypoxic tumour tissue, independent of proliferation, stromal cell infiltration and tumour characteristics. 
Our data suggest that up to half of hypermethylation events are due to hypoxia, with these events conferring a selective 
advantage. Accordingly, increased hypoxia in mouse breast tumours increases hypermethylation, while restoration of 


tumour oxygenation abrogates this effect. Tumour hypoxia therefore acts as a novel regulator of DNA methylation. 


Although the mutagenic processes underlying oncogenesis are well 
studied, tumours are known to be not only genetically but also epige- 
netically distinct from their tissue of origin. The most extensively doc- 
umented examples of oncogenic epigenetic changes are those to DNA 
methylation, but the underlying mechanisms are poorly understood". 

In tumours, changes in DNA methylation involve both global 
hypomethylation and the local hypermethylation of CpG-rich gene 
promoters’. Hypermethylation frequently affects tumour suppressor genes 
(TSGs), downregulating their expression and thus contributing to 
oncogenesis. It remains unclear how methylation changes arise, but an 
instructive model suggests that genetic changes are a prerequisite for 
methylation changes’; BRAF mutations, for instance, lead to hypermeth- 
ylation in colorectal tumours’. This is problematic as, while pervasive, 
hypermethylation of TSGs can only be explained by somatic mutations 
in a fraction of tumours. Notably, extensive hypermethylation can be 
seen in ependymomas completely devoid of somatic mutations’. 

In contrast to DNA methylation mechanisms, those of demethyl- 
ation have remained elusive until recently, when TET methylcyto- 
sine dioxygenases (TET1, TET2 and TET3) were shown to oxidize 
5-methylcytosine (5mC) to 5-hydroxymethylcytosine (5hmC)?°. 
5-Hydroxymethylcytosine and its further-oxidized derivatives are sub- 
sequently replaced with an unmodified cytosine by base-excision repair 
to achieve demethylation®. Reduced 5mC oxidation due to decreased 
TET activity thus increases levels of DNA methylation. Mutations 
suppressing TET activity are often found in myeloid leukaemia and 
glioblastoma® ®, but less frequently in other cancer types. By contrast, 
5hmC loss is pervasive in tumours and even proposed as a cancer 
hallmark'®. As with hypermethylation, somatic mutations explain the 
loss of 5hmC in only a fraction of tumours and it remains unclear 
which other factors trigger this loss”. 


Notably, like hypoxia-inducible factor (HIF)-prolyl-hydroxylase 
domain proteins (PHDs), TET enzymes are Fe2+- and α-ketoglutarate- 
dependent dioxygenases11. PHDs are oxygen-sensitive, acting 
as oxygen sensors. Under normoxic conditions, they hydroxylate the 
HIF transcription factors, targeting them for proteasomal degrada- 
tion, whereas under hypoxia they do not, leading to HIF stabilization 
and hypoxia response activation’. Expanding tumours continuously 
become disconnected from their vascular supply, resulting in vicious 
cycles of hypoxia, HIF activation and tumour vessel formation. 
Consequently, hypoxia pervades in solid tumours. Oxygen levels 
range from 5% to anoxia and around one-third of tumour areas con- 
tain less than 0.5% oxygen!“ Although DNA hypermethylation and 
hypoxia are well-recognized cancer hallmarks, the effect of hypoxia 
on TET hydroxylase activity and subsequent DNA demethylation 
has not been assessed. We therefore set out to investigate whether 
a hypoxic micro-environment decreases TET hydroxylase activity 
in tumours, leading to an accumulation of 5mC and acquisition of 
hypermethylation. 


Effect of hypoxia on DNA hydroxymethylation 

To assess whether hypoxia affects TET activity, we exposed ten human 
and five murine cell lines with detectable 5hmC levels to 21% O2 
(normoxic) or 0.5% O2 (hypoxic, commonly observed in tumours!)
for 24h. Hypoxia induction was verified and DNA was extracted and 
profiled for nucleotide composition using liquid chromatography- 
mass spectrometry (LC-MS). We observed 5hmC loss in eleven cell 
lines, including eight cancer cell lines (Fig. 1a). However, this did not 
translate into global 5mC increases (Extended Data Fig. 1), presumably 
because 5mC is more abundant and is not targeted by TETs at many 
sites’. The effect of hypoxia was concentration- and time-dependent: 
3000 Leuven, Belgium. ‘Institute of Cancer and Genomic Sciences, University of Birmingham, Birmingham B15 2TT, UK. °Gynecologic Oncology, University Hospitals Leuven, Department of Oncology,
KU Leuven, 3000 Leuven, Belgium. ’Department of Genetics, Institute for Cancer Research, Oslo University Hospital Radiumhospitalet, N-0310 Oslo, Norway. 8Department of Clinical Molecular
Biology (EpiGen), Akershus University Hospital and Institute of Clinical Medicine, Faculty of Medicine, University of Oslo, Postboks 1171, Blindern 0318 Oslo, Norway. 2Department of Development and 
Figure 1 | Effect of hypoxia on 5hmC in vitro. a, Levels of 5hmC (top),
and overall TET expression (bottom) in cell lines grown for 24h under
21% or 0.5% O2. RNA expression is expressed relative to the combined
estimated level of all 3 TET paralogues under 21% O2. b, c, 5hmC/C levels
in MCF7 cells exposed to different O2 levels for 24h (b), or 0.5% O2 for
indicated times (c). d, Correlation of changes in overall TET expression
and 5hmC upon hypoxia. Each circle represents a cell line, the full line
the correlation. e, f, Levels of 5hmC (e) and α-ketoglutarate (αKG) (f) in


a dose-response revealed loss of 5hmC at oxygen levels at-or-below 
2%, and a 20% and 40% reduction, respectively, after 15h and >24h 
of hypoxia (Fig. 1b, c). Loss of 5hmC was not due to increased 5hmC 
oxidation to 5fC’%, as hypoxia also decreased 5fC levels in embryonic 
stem (ES) cells (Extended Data Fig. 1). 

In some cell lines, levels of 5hmC failed to decrease under hypoxia. 
5hmC levels were unaffected in cell lines H1299 and 4T1, and even 
increased in SHSY5Y and SK-N-Be2c neuroblastoma cells, as 
reported previously!” (Fig. 1a). When profiling TET expression, 
neuroblastoma cells displayed potent hypoxia-dependent induction 
of TET1 and TET2. Cell lines H1299 and 4T1 exhibited intermediate 
increases in expression levels, whereas all other cell lines showed 
no, or modest, increases of some TET paralogues (Fig. 1a). Tet gene 
expression changes were confirmed at the protein level in mouse 
cell lines, and HIF1β-chromatin immunoprecipitation followed
by sequencing (ChIP-seq) further confirmed that HIF binds near 
the promoters of upregulated Tet genes, but not near those that are 
unaltered (Extended Data Fig. 2a, b), in keeping with the cell-type 
specificity of the hypoxia response’. Notably, no cell line showed 
MCF7 cells grown with ascorbate (e), water or dimethyl-α-ketoglutarate
(2me-αKG) (f) under 21% (white) or 0.5% (red) O2. α-Ketoglutarate
changes are relative to matching water controls. g, As in a, but for cells
exposed to IOX2. h, i, Michaelis-Menten curve of Tet1 (h) and Tet2
(i, n = 3) for O2. Km denotes Michaelis constant. Error bars denote s.e.m.,
grey areas: 95% confidence interval, n = 5 replicates for a-h, *P < 0.05,
**P < 0.01, ***P < 0.001 by t-test (b, c, e) or analysis of variance
(ANOVA) with post-hoc Tukey HSD test (f).


decreased Tet expression, indicating that 5hmC loss is not due to 
reduced Tet expression. 

Since hypoxia affects TET paralogue expression differently in differ- 
ent cell lines we correlated hypoxia-associated changes in overall TET 
expression (the combined abundances of TET1, TET2 and TET3) with 
changes in 5hmC levels. Hypoxia reduced 5hmC levels by an average
of 44% (P=0.0097) in each cell line (Fig. 1d), independently of TET 
expression changes. Nevertheless, changes in TET expression also 
affected 5hmC levels. This was confirmed by short interfering RNA 
(siRNA) knockdown of TET2, which constitutes around 60% of all 
TET expression in MCF7 cells. This reduced 5hmC levels by around 
60% (Extended Data Fig. 2c). Similarly, Tet1 knockout mouse ES cells
(Tet1−/−) displayed lower 5hmC levels than wild-type ES cells, in
which Tet1 is the predominantly expressed Tet paralogue under both
normoxic and hypoxic conditions (Fig. 1a, Extended Data Fig. 2d). 

Post-hypoxic 5hmC levels therefore appear to be determined by 
altered oxygen availability and by changes in TET abundance. This 
explains why cell lines that do not upregulate TET expression in 
response to hypoxia display 5hmC loss, whereas cell lines that strongly
upregulate TET compensate for this, resulting in equal or increased
5hmC levels. 


Hypoxia directly affects DNA hydroxymethylation 

Aside from gene expression, TET activity is affected by a variety of 
cellular processes, including changes in levels of reactive oxygen spe-
cies (ROS), Krebs cycle metabolites and proliferation”!?!”!8. Since
such changes might also occur secondary to hypoxia, we investigated 
whether they underlie 5hmC reductions in hypoxia. 

First, we assessed whether ROS could affect TETs in the nucleus 
through inactivation of Fe2+ in their catalytic domain. Although
hypoxia increased overall ROS levels, no increase in nuclear ROS was 
detected either by a nucleus-specific ROS probe or through 8-oxog- 
uanine (8-oxoG) quantification (Extended Data Fig. 3a-f). Moreover, 
ascorbate supplementation to counteract ROS increases!” failed to 
rescue 5hmC loss (Fig. 1e).

Second, because metabolites such as succinate and fumarate compete
with the TET cofactor α-ketoglutarate’, we investigated whether changes
in their levels were relevant. The concentration of these metabolites,
however, was not increased in hypoxic MCF10A or embryonic stem 
(ES) cells, and only 3-4-fold in MCF7 cells (Extended Data Fig. 3g-i). 
Levels of the onco-metabolite 2-hydroxyglutarate were also increased 
in hypoxic MCF7 and MCF10A cells, but were still only around 5-10% 
of αKG (Extended Data Fig. 3h, j), and therefore unlikely to affect TET
activity, as the affinity of these competing metabolites for hydroxylases is
lower than or similar to that of αKG””°. Culturing MCF7 cells in glutamine-free
medium to decrease the concentration of these metabolites did not alter 
5hmC levels (Extended Data Fig. 3k). Similarly, exogenous addition of 
cell-permeable αKG under hypoxia to counteract putative competing
metabolites did not rescue the 5hmC loss (Fig. 1f). This therefore 
precluded metabolite competition from causing hypoxia-associated 
5hmC loss. 

Third, increases in cell proliferation have also been linked to 5hmC
loss”!. However, cell growth was unaffected or decreased upon exposure
to hypoxia in all cell lines tested, indicating that increased proliferation
does not underlie 5hmC reduction (Extended Data Fig. 3l).

Fourth, to exclude any potential cellular changes caused by HIF acti- 
vation, we pharmacologically activated the hypoxia response program 
by exposing five cell lines grown in atmospheric conditions to IOX2, 
a small molecule inhibitor with high specificity for PHDs”” (Extended 
Data Fig. 3m). Cell lines not characterized by hypoxia-induced TET-
expression changes (MCF10A, A549 and MCF7) showed no change in
5hmC levels under IOX2, while those characterized by TET upregula- 
tion (SK-N-Be2c and SHSY5Y) showed an increase in 5hmC (Fig. 1g). 
Thus, after IOX2 exposure, changes in 5hmC levels mirrored changes 
in TET transcription. We also prepared nuclear protein extracts 
from MCF7 cells grown under hypoxic and atmospheric conditions,
and then compared their 5mC oxidative capacities at the same oxy- 
gen tension in vitro. These, however, were identical (Extended Data 
Fig. 3n). Loss of 5hmC was therefore not due to activation of the 
hypoxia response program. 

Finally, we assessed the effect of varying oxygen concentrations on 
the activity of recombinant purified Tet1 or Tet2, by measuring conver-
sion of 5mC to 5hmC on double-stranded genomic DNA. We observed
a dose-dependent reduction in 5hmC production with decreasing
oxygen concentrations. Importantly, under the hypoxic conditions applied in
this study (0.5% O2), Tet1 and Tet2 activity were reduced by 45% ± 7
and 52% ± 8, respectively (mean ± s.e.m., P=0.01; Fig. 1h, i).
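
The oxygen dependence of an enzyme such as TET can be pictured with the standard Michaelis-Menten rate law, v = Vmax[O2]/(Km + [O2]). The sketch below is illustrative only: the function names are hypothetical, and the Km used in the example is a placeholder chosen for illustration, not a value reported in the text.

```python
def michaelis_menten(oxygen_pct, vmax, km):
    """Reaction rate at a given oxygen level (oxygen_pct and km share units)."""
    return vmax * oxygen_pct / (km + oxygen_pct)


def relative_activity(oxygen_pct, reference_pct, km):
    """Activity at oxygen_pct as a fraction of the activity at reference_pct."""
    # Vmax cancels in the ratio, so any positive value can be used here.
    low = michaelis_menten(oxygen_pct, 1.0, km)
    ref = michaelis_menten(reference_pct, 1.0, km)
    return low / ref


if __name__ == "__main__":
    # Hypothetical Km of 0.5% O2, used only to illustrate the calculation.
    print(round(relative_activity(0.5, 21.0, km=0.5), 3))
```

Under this assumed Km, activity at 0.5% O2 would be roughly half of that at 21% O2; a different Km would shift this ratio accordingly.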

Together, these data demonstrate that decreased oxygen availability 
directly diminishes the oxidative activity of TETs, independently of 
changes in HIF activity, competing metabolites, proliferation, nuclear 
ROS or TET expression. 


Loci with differential DNA hydroxymethylation 
To analyse where in the genome hypoxia reduces 5hmC, DNA from
hypoxic and normoxic MCF7 cells was immunoprecipitated using 
Figure 2 | Genomic profiles of 5(h)mC in MCF7 following hypoxia.
a, Changes in 5hmC at 290,382 peaks detected using 5hmC-DIP-seq.
Peaks gaining (red) and losing (blue) 5hmC are highlighted at P < 0.05
and 5% FDR adjustment (lighter and darker). b, Observed/expected
fraction of 5hmC peaks overlapping with chromHMM chromatin states
either exhibiting hypoxia-associated 5hmC loss (n = 10,001, blue) or
not (n = 280,381, grey). c, d, Changes in 5mC after 24h (c) or 48h (d) of
0.5% O2, assessed by 5mC-DIP-seq at 10,001 hypohydroxymethylated
peaks upon hypoxia (c) or by BS-seq at 1,894 regions capture-selected
using SeqCapEpi (d; see Methods). e, Expression changes of genes
in hypohydroxymethylated, and both hypohydroxymethylated and
hypermethylated peaks. Plots depict 3 (a, e), 4 (c) or 5 (d) replicates,
***P < 0.001 by negative binomial generalized linear models (a, c),
Fisher’s exact test (d) or t-test (e).


antibodies targeting 5mC or 5hmC and subjected to high-throughput
sequencing (DIP-seq). We detected 290,382 sites enriched for 5hmC. 
After hypoxia, 10,001 of the peaks generated for each site exhibited 
a decrease in 5hmC (5% false discovery rate (FDR)) and only 18 
exhibited an increase, thereby confirming global 5hmC loss (Fig. 2a, 
Supplementary Table 1). Genomic annotation of these peaks using 
chromHMM” revealed they were predominantly found at gene pro- 
moters, with some at enhancers and actively transcribed regions, in 
line with known TET-binding sites!> (Fig. 2b). For example, 5hmC
was decreased near transcription start sites of TSGs NSD1, FOXA1 and 
CDKN2A (Extended Data Fig. 4). Analysis of 5mC-DIP signals at these 
10,001 regions highlighted that, in 724 out of 875 altered regions, the 
5mC content was significantly increased (P < 0.05), although only one 
of these sites survived a 5% FDR correction (Fig. 2c, Supplementary 
Table 2). Increases in 5mC were thus more subtle than the decreases
observed for 5hmC.
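
The gap between nominal significance (P < 0.05) and the 5% FDR threshold reflects multiple-testing correction. The text here does not name the procedure, so the sketch below assumes the widely used Benjamini-Hochberg step-up adjustment; the function name and the example P values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical P values for four regions.
    pvals = [0.01, 0.04, 0.03, 0.5]
    adj = benjamini_hochberg(pvals)
    significant = [q <= 0.05 for q in adj]
    print(adj, significant)
```

In this toy example three regions pass P < 0.05 but only one has an adjusted value at or below 0.05, which mirrors how many nominally altered sites can fail to survive FDR correction.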

Several days may be required for 5hmC changes to translate into 5mC 
changes!”. We therefore cultured cells for 48h (rather than 24h) under 
hypoxia, and used targeted bisulfite-sequencing (BS-seq) to obtain 
base-resolution quantitation of 5mC at around 85 Mb of promoters 
and enhancers. Using this approach, we could assess increases in 5mC 
for 1,894 of the 10,001 regions displaying 5hmC loss. As observed after 
5mC-DIP-seq, 301 out of 402 altered sites displayed increased meth- 
ylation (P < 0.05). Similarly, 60 out of 99 altered sites were increased 
with 5% FDR correction (P=2.8 x 107°; Fig. 2d, Supplementary 
Table 3). ChromHMM annotation revealed that these 60 sites were 
predominantly in gene promoters and enhancers. To assess the effect 
of hypermethylation on gene expression, we performed RNA sequenc- 
ing (RNA-seq) on hypoxic MCF7 cells. Genes depleted in 5hmC and 
with increased 5mC showed significantly decreased expression upon 
hypoxia (Fig. 2e; P=2.5 x 10~*7 and 7.4 x 1074, respectively, for 3,660 
Figure 3 | Effect of hypoxia on hypermethylation in TCGA.
a, Observed and expected number of hypoxic versus normoxic tumours in
3 methylation clusters for 1,000 CpGs hypermethylated in tumour versus
normal tissue. b, Percentage of hypermethylation events in promoters of
frequently hypermethylated genes. n = 3,141 tumours, *P < 0.05,
**P < 0.01, ***P < 0.001 by Cochran-Armitage test (a), generalized linear
model per tumour type corrected for co-variants (Supplementary Table 8) 
(b). BLCA, bladder carcinoma; BRCA, breast carcinoma; COAD, colorectal 
adenocarcinoma; HNSC, head and neck squamous cell carcinoma; KIRC, 
kidney renal clear cell carcinoma; LUAD, lung adenocarcinoma; LUSC, 
lung squamous cell carcinoma; UCEC, uterine corpus endometrial 
carcinoma. 
genes with 5hmC loss and 55 genes with both 5hmC loss and 5mC gain;
Supplementary Table 4). Reduced TET activity therefore leads to an 
accumulation of 5mC, decreasing the expression of associated genes. 


Hypermethylation events in hypoxic tumours 

We next assessed whether 5hmC loss and concomitant 5mC gain 
also occur in vivo. We focused on gene promoters as they are more 
frequently affected upon hypoxia, and directly linked to gene expression. 
As cancer cells go through multiple rounds of sustained hypoxia’, we 
proposed that there would be an increase in 5mC, as it would provide 
a selective advantage for cancer cells, similar to somatic mutations. We 
therefore assessed 5hmC levels in three patient-derived tumour xeno- 
grafts, in which we marked hypoxic areas with pimonidazole (Extended 
Data Fig. 5a). Immunofluorescence analysis revealed decreased 5hmC 
in hypoxic areas, linking tumour hypoxia to 5hmC loss in vivo. 

To assess whether hypoxia-associated hypermethylation contrib- 
utes to the oncogenic process, we analysed tumours profiled in the 
pan-cancer study of The Cancer Genome Atlas (TCGA)™. We selected 
8 solid tumour types (3,141 tumours) for which both DNA methylation 
(450k array) and gene expression (RNA-seq) data were available for 
>100 samples, and classified each as hypoxic, normoxic or interme- 
diate using an established gene signature’® (Extended Data Fig. 5b). 
Next, we analysed tumour-associated DNA hypermethylation in each 
tumour type by performing unsupervised clustering of 1,000 CpGs that 
displayed the strongest hypermethylation in tumour versus normal 
tissue (Extended Data Fig. 5c). In the first three clusters (displaying 
low, intermediate and high average hypermethylation), we analysed the 
enrichment of hypoxic tumours. For all eight tumour types, hypoxic 
tumours predominated in the hypermethylated cluster and normoxic 
tumours in the hypomethylated cluster (Fig. 3a; P=2 x 107“), suggest- 
ing that hypoxia leads to increased methylation in tumours. 

Whereas the prior analysis identified uniform increases in methyla- 
tion based on average changes, it poorly captured exceptional increases 
in hypermethylation known to occur in a subset of tumours’. We 
therefore also modelled tumour hypermethylation by annotating 
increases in CpG methylation at gene promoters using a stringent 
threshold (Bonferroni-corrected P < 0.05) as hypermethylation events. 
In each tumour type, the promoters of 187 ± 38 out of 29,649 genes
frequently displayed hypermethylation events (Supplementary Table 5). 
Notably, hypoxic tumours had on average 4.8-fold more hyper-
methylation events in these genes than normoxic tumours (Fig. 3b;
P=4.1 x 107'3). These events were functional, reducing gene expres- 
sion in tumours carrying these hypermethylation events (Extended 
Data Fig. 5d). They primarily affected promoters with high or inter- 
mediate CpG content, in line with TET target preference (Extended 
Data Fig. 5e)!°. Furthermore, they were not restricted to a small subset: 
77% ± 6.5, 49% ± 9.3 or 39% ± 9.1 of hypoxic tumours were affected
by >1, >10 or >20 hypermethylation events, respectively. Considering 
hypermethylation frequency in normoxic tumours as baseline, up to 
48% of hypermethylation events were hypoxia-related. 
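
One way to read the baseline comparison above is as an excess-fraction calculation: the events expected if every tumour had the normoxic event rate are subtracted from the events observed. The sketch below is a possible interpretation under that assumption, not a description of the authors' exact procedure; the function name and all counts are hypothetical.

```python
def excess_fraction(events_by_group, tumours_by_group, baseline="normoxic"):
    """Fraction of all events exceeding what the baseline group's rate predicts."""
    # Events per tumour in the baseline (normoxic) group.
    baseline_rate = events_by_group[baseline] / tumours_by_group[baseline]
    observed = sum(events_by_group.values())
    # Events expected if every tumour had the baseline rate.
    expected = baseline_rate * sum(tumours_by_group.values())
    return (observed - expected) / observed


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    events = {"normoxic": 100, "intermediate": 200, "hypoxic": 480}
    tumours = {"normoxic": 100, "intermediate": 100, "hypoxic": 100}
    print(round(excess_fraction(events, tumours), 3))
```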

As hypermethylation can also be genetically encoded, mutations in 
some genes correlated positively with hypermethylation (for example, 
IDH1, TET1, TET3 and BRAF; Supplementary Table 6). Importantly, 
hypoxia predicted hypermethylation independent of mutation status 
(P=6.1 x 107!*), Mutations inhibiting TET activity were infrequent 
(approximately 1.8% of tumours), indicating that hypermethylation is 
not genetically encoded in most tumours. TET-mutant tumours were 
also not more hypoxic, suggesting that hypoxia induces hypermethyl- 
ation, and not vice versa (Extended Data Fig. 5f). Hypoxia-associated 
hypermethylation events occurred independently of other tumour 
characteristics such as tumour cell percentage, immune cell infiltra- 
tion, tumour size, proliferation or metastasis (P=4 x 10~!3), and were 
significant in seven out of eight tumour types (Supplementary Tables 7, 8). 
In line with an earlier report”!, high proliferation was the only other 
variable significantly predicting hypermethylation (P=5.3 x 107°), 
although only in four of eight tumour types (Extended Data Fig. 5g, h). 
Using multiple regression, we estimated the contribution of tumour 
characteristics to hypermethylation variance. On the basis of partial 
correlation coefficients, proliferation predicted 12.1% ± 4.1, and
hypoxia 33.3% ± 5.7, of hypermethylation events explained by the
model (Extended Data Fig. 5i). 

Given the increase in hypermethylation events in hypoxic tumours, 
we next selected genes with more hypermethylation events in hypoxic 
versus normoxic tumours (5% FDR). This revealed 263 ± 94 genes
per tumour type, with 9.0% ± 1.6 being shared between any two types
(Supplementary Table 9). Ontology analysis of hypermethylated genes 
revealed that they had biological processes in common such as cell 
cycle arrest, DNA repair and apoptosis. Hypermethylation was also 
observed in genes involved in suppressing glycolysis, angiogenesis and 
metastasis, consistent with tumour hypoxia inducing these processes 
(Extended Data Fig. 6a-c). 


Reduced TET activity underlies hypermethylation 
We used three strategies to confirm the role of TET activity in hypoxia- 
associated hypermethylation. First, we correlated TET expression with 
hypermethylation events, correcting for hypoxia and proliferation. 
TET2 and TET3 expression inversely correlated with hypermethylation 
(P= 0.046 and 0.0028, Extended Data Fig. 7a), as did hypoxia and 
proliferation (P< 1.2 x 107! for both). Similar to our in vitro observa- 
tions, this implicates reduced TET activity in hypermethylation. 
Second, we assessed the overlap of hypermethylation events induced 
by hypoxia and IDH1R132 mutations® in 63 glioblastomas. Among
wild-type IDH1 glioblastomas, hypermethylation frequency was 3.4- 
fold higher in hypoxic tumours (Fig. 4a, Extended Data Fig. 7b). As 
expected, IDH1R132 tumours were hypermethylated, albeit 3.9-fold
more so than hypoxic tumours (Fig. 4a). This indicates that TET 
enzymes, fully inactivated in IDH-mutant tumours’, were only partially 
inactivated in hypoxia, similar to our in vitro observations. Of 228 genes 
frequently hypermethylated in glioblastomas, those in the hypoxic and 
IDH-mutant subgroups displayed a 58% overlap (P < 10~'°; Fig. 4b) 
and reduced expression (Extended Data Fig. 7c), indicating that loss of 
TET activity affects the same genes, regardless of the underlying trigger. 
Finally, to link hypoxia-associated hypermethylation to 5hmC loss, 
we profiled 24 non-small-cell lung tumours for 5mC and 5hmC using 
450k arrays (Extended Data Fig. 7d). This revealed a generalized loss 
Figure 4 | Effect of hypoxia on TET activity in human tumours.
a, Hypermethylation in 19 normoxic (blue), 21 intermediate (grey),
17 hypoxic (red) and 6 IDH1R132-mutated (yellow) glioblastomas.
b, Overlap between genes hypermethylated in hypoxic versus IDH-
mutated glioblastomas. c, 5hmC measured across 485,000 CpGs in 12
of 5hmC in hypoxic tumours (−7.1% ± 1.1; P=3.7 x 10⁻³; Fig. 4c).
Individual probes also mostly displayed 5hmC loss and 5mC gain in 
hypoxic tumours (96.7% and 65.4% of probes altered, respectively, 
P<0.01; Supplementary Table 10). Of all probes displaying 5mC gain, 
most (87%) also displayed 5hmC loss, and of probes altered both in 
5hmC and 5mC (P< 0.01), 92% showed 5hmC loss and 5mC gain
(Fig. 4d; P< 10~'*). This directly implicates hypoxia-induced loss of 
5hmC in the hypermethylation of hypoxic tumours. 


Rescue of hypoxia-induced hypermethylation

To manipulate tumour oxygenation and confirm its effect on hyper- 
methylation, we used mice expressing the polyomavirus middle 
T-antigen under the mouse mammary tumour virus promoter 
(MMTV-PyMT). These mice spontaneously develop breast tumours, 
with hypoxic areas emerging from 7 weeks onwards, encompassing 
approximately 20% of the tumour at 16 weeks”’. Hypoxic areas in these 
tumours were also depleted in 5hmC (Fig. 5a, b). 

We monitored hypermethylation changes by targeted BS-seq of TSG 
promoters commonly inactivated in cancer”*. Hypoxic human breast 
tumours displayed a specific increase in hypermethylation at these TSG 
promoters, whereas no effect was observed for oncogenes (Extended 
Data Fig. 8a). In line with the age-associated increase in tumour 
hypoxia’’, hypermethylation events also increased markedly with age 
(and tumour size), but not in normal mammary glands (Extended 
Data Fig. 8b-d). Importantly, >95% of cells in these tumours were 
PyMT-positive, whereas cell proliferation and immune cell infiltration 
were comparable between hypoxic and normoxic areas (Extended Data 
Fig. 8e-g). Hypermethylation changes are therefore unlikely to be a 
result of changes in proliferation or cellular heterogeneity. 

To test whether reduced tumour oxygenation increases hypermeth- 
ylation, 9-week-old MMTV-PyMT mice were hydrodynamically 
injected with a soluble-Flk1 (sFlk1)-expressing plasmid. After 3 weeks,
this caused tumour vessel pruning and hypoxia (Extended Data 
Fig. 9a-d). Shallow whole-genome sequencing for 5hmC (TET-assisted 
bisulfite sequencing; TAB-seq) revealed a global loss of 5hmC after 
sFlk1 overexpression (−12.4% ± 3.5, P=0.040), occurring predom-
inantly at gene-dense regions and affecting the entire gene (Fig. 5c,
Extended Data Fig. 9e), consistent with previously described 5hmC 
distributions'°. Moreover, targeted BS-seq revealed an exacerbated hyper- 
methylation phenotype after sFlk1 overexpression at 12 weeks in TSGs 
but not oncogenes (10 out of 15 TSGs contained >1 hypermethylation 
event; P=0.010, Fig. 5d, Extended Data Fig. 9f). Tumour growth and 
the expression of proliferation markers, Tet paralogues and the immune 
cell marker CD45 were unaffected by sFlk1 overexpression, indicating
that hypermethylation occurs independently (Extended Data Fig. 9g-j).

To rescue this effect, we normalized the tumour vasculature by 
intercrossing a heterozygous Phd2 (also known as Egln1) loss-of- 
function allele with the PyMT transgene. This significantly reduced 
tumour hypoxia at 16 weeks?” (Extended Data Fig. 9k). TAB-seq 
revealed a 5hmC gain (+25.3% ± 4.7, P=0.0098) occurring primarily
normoxic versus 12 hypoxic non-small-cell lung tumours. d, Changes in
5(h)mC for unaltered CpGs (grey), and CpGs altered in both 5mC and
5hmC (25% FDR, blue; P < 0.01, red). ***P < 0.001 by Fisher’s exact test (a),
**P < 0.01 by t-test (c).


at gene-dense regions and affecting the entire gene (Fig. 5c, Extended 
Data Fig. 9l). Notably, BS-seq revealed that, although 8 out of 15 TSGs
displayed >1 hypermethylation event in Phd2+/+ tumours, no hyper-
methylation was observed in Phd2+/− tumours (P= 2.6 x 10’, Fig. 5e).
Again, oncogenes were unaffected (Extended Data Fig. 9m). Effects 
were independent of Phd2 haplodeficiency in tumour cells, as similar 
effects were observed in PyMT mice having endothelial-cell-specific 
Phd2 haplodeficiency”” (Extended Data Fig. 9n, o). As in the sFlk1
Figure 5 | Effect of vessel pruning and normalization on 5hmC and
TSG hypermethylation. a, b, Immunofluorescence of breast tumours in
transgenic (MMTV-PyMT) mice. a, Representative image. Scale bar, 50 µm.
b, Box plot of 5hmC signal in >150 PyMT-positive nuclei from eight
tumours, stratified for pimonidazole (PIMO) (yes/no) and normalized to
PIMO-negative nuclei. c, 5hmC levels ± s.e.m. across a metagene in tumours
of 12-week-old mice receiving empty or sFlk1-overexpressing plasmid
(left, n = 3), or 16-week-old mice with the indicated genotype (right, n = 3 for
Phd2+/+; n = 4 for Phd2+/−). d, e, Hypermethylation in tumours developing
in 12-week-old mice receiving empty (n = 19) or sFlk1-overexpressing
plasmid (n = 24) 3 weeks earlier (d), and in tumours developing in
16-week-old Phd2+/− (n = 10) and Phd2+/+ (n = 9) mice (e). Plotted are z-scores
of hypermethylation, relative to normoxic tumours (empty and Phd2+/− for
d and e). Dotted line: 5% FDR, darker dots: significant hypermethylation.
Brca1 and Timp3: not shown (no hypermethylation event detected).
Hypermethylated genes on average had 5.8% (d) and 4.7% (e) more
methylation. *P < 0.05, **P < 0.01, ***P < 0.001 by t-test.
model, increasing tumour oxygenation by Phd2 haplodeficiency did
not affect tumour growth or the expression of proliferation markers, 
Tet paralogues or CD45 (Extended Data Fig. 9p-u). 


Discussion 

We show here that tumour hypoxia directly reduces TET activity, causing 
a 5hmC decrease predominantly at gene promoters and enhancers. 
Concomitantly, 5mC increases at these sites and, as with certain genetic 
mutations, provides a substrate for oncogenic selection in vivo7®. Since 
hypoxia prevails in tumours, 5mC changes in TSG promoters are 
frequent, rendering hypoxic tumours hypermethylated at these sites. 
Hypermethylation events in tumours have long been suspected to occur 
through selection of random DNA methylation variants”. However, 
the identification of genetically encoded hypermethylation challenged 
this stochastic model”. By demonstrating that hypoxia drives hyper- 
methylation, we show that genetically-encoded and tumour-microen- 
vironment-driven models of epimutagenesis co-exist. However, since 
hypoxia is pervasive, the mechanism described here is relevant for 
most solid tumours. We found that up to 48% of hypermethylation 
events were hypoxia-related and effects were replicated in all tumour 
types investigated, independent of mutation- and proliferation-
induced hypermethylation. Modest hypoxia (2-5% O2) did not affect
TET activity, indicating that TET enzymes are not physiological oxygen 
sensors (unlike PHDs) in line with previous reports*”. TET activity only 
becomes limiting under the pathophysiological oxygen concentrations 
found in tumours!*. Analogous to somatic TET haploinsufficiency, 
this partial reduction in TET activity contributes to oncogenesis. Our 
findings also suggest intriguing avenues of investigation into other 
ischaemia-related pathologies. 

Our model provides a mechanism for the association between 
hypoxia and maladaptive oncogenic processes. Genes affected by hyper- 
methylation were not only involved in cell-cycle arrest, DNA repair and 
apoptosis, but also glycolysis, metastasis and angiogenesis. High doses of 
angiogenesis inhibitors stimulate metastatic spreading in mouse cancer 
models (at least in specific settings)*!, and tumour hypoxia is considered 
a driver of this behaviour. The mechanism by which hypermethylation 
accumulates under hypoxia may underlie these escape mechanisms. By 
contrast, low levels of angiogenic inhibition can induce tumour vessel 
normalization, and improve oxygenation*”. Our observations in nor- 
malized PyMT tumours suggest that the therapeutic benefits of vessel 
normalization such as decreased metastatic burden?’, might occur by 
inhibiting hypoxia-associated hypermethylation. Countering hyper- 
methylation by inhibiting DNA methylation or by normalizing tumour 
blood supply may therefore prove to be therapeutically beneficial. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS


Materials. All materials were molecular biology grade. Unless noted otherwise, 
all were from Sigma. 

Analysis of global 5mC and 5hmC levels in cultured cells. MCF7, MCF10A, 
A549, H1299, SHSY5Y, Hep G2, Hep 3B2, HT-1080, NCI-H358, LLC, Neuro-2a, 
4T1 and SK-N-Be2c cell lines were obtained from the American Type Culture 
Collection and their identity was not further authenticated. These cell lines are not 
listed in the database of commonly misidentified cell lines maintained by ICLAC. 
LLC, Neuro-2a, 4T1, Hep G2, HT-1080, Hep 3B2, MCF7 and A549 cells were cul-
tured at 37°C in DMEM with 10% fetal bovine serum (FBS), 5 ml of 100 U ml⁻¹
penicillin-streptomycin (Life Technologies) and 5 ml of L-glutamine 200 mM.
NCI-H358, H1299 and SK-N-Be2c cell lines were cultured at 37°C in RPMI
1640 Medium with 10% FBS, 1% penicillin-streptomycin and 1% L-glutamine.
MCF10A cells were cultured at 37°C in DMEM/F-12 supplemented with 5% horse
serum (Life Technologies), 20 ng ml⁻¹ human epidermal growth factor (Peprotech),
0.5 µg ml⁻¹ hydrocortisone, 100 ng ml⁻¹ cholera toxin, 10 µg ml⁻¹ insulin, and
100 U ml⁻¹ penicillin-streptomycin. The SHSY5Y cell line was cultured at 37°C
in DMEM/F-12 supplemented with 10% FBS, 2% penicillin-streptomycin and 1% 
non-essential amino acids (MEM). Mouse J1 ES cells were cultured feeder-free in 
fibroblast-conditioned medium. Cell cultures were confirmed to be mycoplasma- 
free every month. 

Cell line treatment conditions. Control cell cultures were grown at atmospheric 
oxygen concentrations (21%) with 5% CO2. To render cultures hypoxic, they were
incubated in an atmosphere of 0.5% O2, 5% CO2 and 94.5% N2. Where indicated,
IOX2 (50 µM), ascorbate (0.5 mM, a dose known to support TET activity!”) or
dimethyl-α-ketoglutarate (0.5 mM) was added to fresh culture medium, using an
equal volume of the carrier (DMSO) as a control for IOX2. Cells were plated at 
a density tailored to reach 80-95% confluence at the end of the treatment. Fresh 
medium was added to the cells just before hypoxia exposure. For glutamine-free 
culture experiments, dialysed FBS was added to glutamine-free DMEM, and 
supplemented with glutamine (4mM) for the control. Mouse J1 ES cells and 
Tet1-gene-trap ES cells were cultured feeder-free in fibroblast-conditioned medium. 
DNA extraction. After exposure to the aforementioned stimuli, cultured cells were 
washed on ice with ice-cold PBS with deferoxamine (PBS-DFO, 200 µM), detached
using cell scrapers and collected by centrifugation (400g, 4°C). Nucleic acids were
subsequently extracted using the Wizard Genomic DNA Purification kit (Promega)
according to instructions. All buffers were supplemented with DFO (200 µM) and
DNA was dissolved in 80 µl PBS-DFO with RNase A (200 U, NEB) and incubated
for 10 min at 37°C. After proteinase K addition (200 units) and incubation for
30 min at 56°C, DNA was purified using the QIAQuick blood and tissue kit (all
buffers supplemented with DFO). It was eluted in 100 µl of a 10 mM Tris, 1 mM
EDTA solution (pH 8) and stored at −80°C until further processing.

LC-ESI-MS/MS of DNA to measure 5mC, 5hmC and 8-oxoG levels. To measure 
the cytosine, 5mC, 5hmC and 8-oxoG content of the DNA samples, three technical 
replicates were run for each sample. More specifically, 0.5-2 µg DNA in 25 µl 
H2O was digested in an aqueous solution (7.5 µl) of 480 µM ZnSO4 containing 
42 U nuclease S1 and 5 U Antarctic phosphatase; specific amounts of labelled 
internal standards were added and the mixture was incubated at 37°C for 3 h in 
a Thermomixer comfort (Eppendorf). After addition of 7.5 µl of 520 µM [Na]2-EDTA 
solution containing 0.2 U snake venom phosphodiesterase I, the sample 
was incubated for another 3 h at 37°C. The total volume was 40 µl. The sample was 
then kept at −20°C until the day of analysis. Samples were then filtered using 
an AcroPrep Advance 96-filter plate 0.2 µm Supor (Pall Life Sciences) and 
analysed by liquid chromatography electrospray ionization tandem mass spectrometry 
(LC-ESI-MS/MS), which was performed using an Agilent 1290 UHPLC 
system and an Agilent 6490 triple quadrupole mass spectrometer coupled with 
the stable isotope dilution technique. DNA samples were digested to give a nucleoside 
mixture and spiked with specific amounts of the corresponding isotopically 
labelled standards before LC-MS/MS analysis. The nucleosides were analysed in 
the positive ion selected reaction monitoring mode (SRM). In the positive ion 
mode, [M + H]+ species were measured. 

Determination and comparison of nucleoside concentrations. The resulting 
cytosine, 5mC, 5hmC and 8-oxoG peak areas were normalized using the isotop- 
ically labelled standards, and expressed relative to the total cytosine content (that 
is, C + 5mC + 5hmC). Concentrations were depicted as averages of independent 
replicates grown on different days, and compared between hypoxia and normoxia 
(21% O2), or between control and treated conditions, using a paired Student’s t-test. 
No statistical methods were used to predetermine sample size. 
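
The normalization and comparison described in this paragraph can be illustrated with a minimal Python sketch. It is not the authors' code: the function names and all numeric values are hypothetical, the inputs are assumed to be peak areas already normalized to the isotopically labelled standards, and only the t statistic (not the P value) is computed, to stay within the standard library.

```python
from math import sqrt
from statistics import mean, stdev


def relative_levels(c, mc, hmc):
    """Express each species as a fraction of total cytosine (C + 5mC + 5hmC)."""
    total = c + mc + hmc
    return {"C": c / total, "5mC": mc / total, "5hmC": hmc / total}


def paired_t_statistic(control, treated):
    """Return (t, degrees of freedom) for a paired Student's t-test."""
    diffs = [t - c for c, t in zip(control, treated)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical standard-normalized amounts; one entry per independent replicate.
normoxia = [relative_levels(960.0, 39.0, 1.0),
            relative_levels(958.0, 41.0, 1.0),
            relative_levels(961.0, 38.1, 0.9)]
hypoxia = [relative_levels(960.4, 39.0, 0.6),
           relative_levels(958.3, 41.0, 0.7),
           relative_levels(961.4, 38.1, 0.5)]

t_stat, dof = paired_t_statistic([r["5hmC"] for r in normoxia],
                                 [r["5hmC"] for r in hypoxia])
print(f"5hmC, hypoxia vs normoxia: t = {t_stat:.2f}, df = {dof}")
```

Pairing is by replicate day, so each hypoxic value is compared with the normoxic value grown in parallel; a negative t indicates lower 5hmC under hypoxia.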

RNA extraction, cDNA synthesis and qPCR. For RNA extraction, cell culture 
medium was removed, and TRIzol (Life Technologies) was added and processed according 
to the manufacturer's guidelines. Reverse transcription and qPCR were performed 
using 2× TaqMan Fast Universal PCR Master Mix (Life Technologies), TaqMan 
probes and primers (IDT, sequence in Supplementary Table 12). Thermal cycling 
and fluorescence detection were done using a LightCycler 480 Real-Time PCR 
System (Roche). TaqMan assay amplification efficiencies were verified using serial 
cDNA dilutions, and estimated to be >95%. 

mRNA concentration analysis and statistics. Cycle threshold (Ct) values were 
determined for each sample and gene of interest in technical duplicates, and normalized 
according to the corresponding amplification efficiency. Per sample, TET 
expression was expressed relative to β-2-microglobulin (human) or hypoxanthine 
phosphoribosyltransferase 1 (Hprt, mouse) levels by subtraction of their average Ct 
values. Concentrations were expressed as averages of at least 5 replicates extracted 
on different days. For Fig. 1a, copy number estimates for TET1, TET2 and TET3 
were expressed for each cell line, relative to the summed copy number estimates 
of TET1, TET2 and TET3 under control conditions (21% O2). Concentrations 
were compared between hypoxia and normoxia, or between control and treatment 
conditions using a Student's t-test. No statistical methods were used to predeter- 
mine sample size. 
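
As a rough illustration of the Ct arithmetic described above, the following Python sketch converts Ct values into expression relative to a reference gene and then into a fraction of summed TET levels. It is an assumption-laden sketch rather than the authors' pipeline: the function names and Ct values are hypothetical, and a single per-cycle amplification factor is applied to both target and reference for simplicity.

```python
from statistics import mean


def relative_expression(ct_target, ct_reference, efficiency=2.0):
    """Abundance of a target relative to a reference gene.

    ct_target, ct_reference: Ct values of the technical replicates.
    efficiency: per-cycle amplification factor (2.0 means 100% efficient).
    """
    delta_ct = mean(ct_target) - mean(ct_reference)
    return efficiency ** (-delta_ct)


def fraction_of_total(levels):
    """Scale a dict of relative levels so that the values sum to 1."""
    total = sum(levels.values())
    return {gene: value / total for gene, value in levels.items()}


# Hypothetical technical duplicates for one sample under control conditions.
reference_ct = [18.1, 18.0]
tet_ct = {"TET1": [27.9, 28.1], "TET2": [25.0, 25.2], "TET3": [26.1, 26.0]}

relative = {gene: relative_expression(ct, reference_ct)
            for gene, ct in tet_ct.items()}
for gene, share in fraction_of_total(relative).items():
    print(f"{gene}: {share:.2f} of summed TET expression")
```

Lower Ct values correspond to higher abundance, which is why the exponent carries a negative sign.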

Hypoxia marker gene induction. To further verify induction of the hypoxia 
response program, hypoxia marker gene expression was measured. We analysed 
mRNA levels of genes encoding the E1B 19K/Bcl-2-binding protein Nip3 (BNIP3) 
and fructose-bisphosphate aldolase (ALDOA), 2 established hypoxia marker 
genes. Reverse transcriptase-quantitative PCR (RT-qPCR) was performed as 
described for the TET mRNA concentration assays, and differential expression was 
calculated using the ΔΔCt method. We ruled out transcriptional upregulation 
as the cause of the increase in HIF1α protein concentrations by assessing HIF1A 
mRNA expression in parallel. mRNA concentrations were expressed relative to 
normoxic controls (21% O2). Differences in mRNA concentration were assessed 
using a Student's t-test on 5 or more independent replicates grown on different 
days. 
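
For readers unfamiliar with the ΔΔCt calculation, a minimal Python sketch follows. The function name and Ct values are hypothetical, and the sketch assumes 100% amplification efficiency (a doubling per cycle), which is the usual simplification of this method.

```python
def fold_change_ddct(ct_gene_hypoxia, ct_ref_hypoxia,
                     ct_gene_normoxia, ct_ref_normoxia):
    """Fold change of a gene in hypoxia relative to normoxia (ΔΔCt method)."""
    delta_hypoxia = ct_gene_hypoxia - ct_ref_hypoxia
    delta_normoxia = ct_gene_normoxia - ct_ref_normoxia
    delta_delta_ct = delta_hypoxia - delta_normoxia
    return 2.0 ** (-delta_delta_ct)


# Hypothetical mean Ct values for a hypoxia marker gene and a reference gene.
fold = fold_change_ddct(ct_gene_hypoxia=22.0, ct_ref_hypoxia=18.0,
                        ct_gene_normoxia=24.0, ct_ref_normoxia=18.0)
print(f"Fold induction under hypoxia: {fold:.1f}")
```

A fold change above 1 indicates induction relative to the normoxic control; values from independent replicates would then be compared with a t-test as described.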

Western blotting for Hif1α, Tet1, Tet2 and Tet3. To assess Hif1α protein stabilization, 
proteins were extracted from cultured cells as follows: cells were placed on 
ice, and washed twice with ice-cold PBS. Proteins were extracted with extraction 
buffer (50 mM Tris HCl, 150 mM NaCl, 1% Triton X-100, 0.5% sodium deoxycholate 
and 0.1% SDS) with 1× protease inhibitor cocktail. Protein concentrations 
were determined using a bicinchoninic acid protein assay (BCA, Thermo 
Scientific) following the manufacturer’s protocol. An estimated 60 µg protein 
was loaded per well on a NuPAGE Novex 3-8% Tris-Acetate Protein gel (Life 
Technologies), separated by electrophoresis and blotted on polyvinylidene fluoride 
membranes. Membranes were activated with methanol, washed and incubated with 
antibodies targeting β-actin (4967, Cell Signaling), Tet1 (09-872, Millipore) and 
Tet3 (61395, Active Motif), at 1:1,000 dilution, targeting Tet2 (124297, Abcam) at 
1:250 dilution, and targeting Hif1α (C-Term) (Cayman Chemical Item 10006421) 
at 1:3,000 dilution. Secondary antibodies and detection were according to routine 
laboratory practices. Western blotting was performed on 6 independent replicates 
grown on different days. 

Analysis of HIF1β target genes using ChIP-seq. To confirm that hypoxia-associated 
differential expression of TET genes is induced by the HIF pathway, we 
performed HIF1β ChIP-seq. Because HIF1β is the obligate binding partner of all 
three HIFα proteins stabilized and activated upon hypoxia, HIF1β ChIP-seq 
reveals all direct HIF-target genes. 

Chromatin immunoprecipitation. Approximately 25 × 10⁶ to 30 × 10⁶ cells were 
incubated in hypoxic conditions for 16 h. Cultured cells were subsequently immediately 
fixed by adding 1% formaldehyde (16% formaldehyde (w/v), Methanol-free, 
Thermo Scientific) directly to the medium and incubating for 8 min. Fixed cells 
were incubated with 150 µM of glycine for 5 min to revert cross-links, washed 
twice with ice-cold PBS 0.5% Triton X-100, scraped and collected by centrifugation 
(1,000g for 5 min at 4°C). The pellet was re-suspended in 1,400 µl of RIPA 
buffer (50 mM Tris-HCl pH 8, 150 mM NaCl, 2 mM EDTA pH 8, 1% Triton X-100, 
0.5% sodium deoxycholate, 1% SDS, 1% protease inhibitors) and transferred to a 
new Eppendorf tube. The lysate was homogenized by passing through an insulin 
syringe, and incubated on ice for 10 min. The chromatin was sonicated for 3 min 
using a Branson 250 Digital Sonifier with 0.7 s ‘On’ and 1.3 s ‘Off’ pulses at 
40% power amplitude, yielding a size of 100 to 500 bp. The sample was kept ice 
cold at all times during the sonication. The samples were centrifuged (10 min at 
16,000g at 4°C) and the supernatant was transferred to a new Eppendorf tube. The 
protein concentration was assessed using a BCA assay. Fifty microlitres of sheared 
chromatin was used as ‘input’ and 1.4 µg of primary ARNT/HIF-1β monoclonal 
antibody (NB100-124, Novus) per 1 mg of protein was added to the remainder of 
the chromatin, and incubated overnight at 4°C in a rotator. Pierce Protein A/G 
Magnetic Beads (Life Technologies) were added to the samples in a volume four 
times the volume of the primary antibody and incubated at 4°C for at least 5 h. 
A/G Magnetic Beads were collected and the samples were washed five times with 
the washing buffer (50 mM Tris-HCl, 200 mM LiCl, 2 mM EDTA, pH 8, 1% Triton, 
0.5% sodium deoxycholate, 0.1% SDS, 1% protease inhibitors), and twice with a 
10 mM Tris, 1 mM EDTA (TE) buffer. The A/G magnetic beads were re-suspended 
in 50 µl of TE buffer, and 1.5 µl of RNase A (200 units, NEB) was added to the A/G 
bead samples and to the input, followed by incubation for 10 min at 37°C. After 
addition of 1.5 µl of proteinase K (200 U) and overnight incubation at 65°C, the DNA 
was purified using 1.8 volumes of Agencourt AMPure XP (Beckman Coulter) according 
to the manufacturer's instructions, and then eluted in 15 µl of TE buffer. The input 
DNA was quantified on NanoDrop. 

ChIP-seq, mapping and analysis. In total, 5 µg of input and all of the immunoprecipitated 
DNA were converted into sequencing libraries using the NEBNext 
DNA library prep master mix set. A single end of these libraries was sequenced for 
50 bases on a HiSeq 2000, mapped using Bowtie and extended for the average insert 
size (250 bases). ChIP peaks were called by model-based analysis for ChIP-Seq (MACS), 
with standard settings and using a sequenced input sample as baseline. 

Patient-derived xenografted tumours. To assess whether tumour-associated 
hypoxia reduces 5hmC levels in vivo, redundant material from two endometrial 
tumours and a breast tumour, removed during surgery, was grafted in the interscapular 
region of nude mice. Informed consent was obtained from the patient, 
following the ethical approval of the local ethical committee. All animal experiments 
were approved by the local ethical committee (P098/2014). Each tumour 
was allowed to grow to 1 cm³, after which it was collected. 10% of this tumour was 
re-implanted in a nude mouse, and the tumour was propagated for three generations 
until it was used for this experiment. To mark hypoxic areas, mice were 
injected with pimonidazole (60 mg kg⁻¹, Hypoxyprobe) i.p. 1 h before killing. 

Immunofluorescence staining and analysis. Tumours were collected, fixed in 
formaldehyde and embedded in paraffin using standard procedures. Paraffin was 
removed and slides were rehydrated in two xylene baths (5 min), followed by five 
3-min ethanol baths at decreasing concentrations (100%, 96%, 70%, 50% and 
water) and a 3-min TBS (50 mM Tris, 150mM NaCl, pH 7.6) bath. 

The following antibodies were used for immunofluorescence staining: pri- 
mary antibodies were FITC-conjugated mouse anti-pimonidazole (HP2-100, 
Hypoxyprobe), rabbit anti-5hmC (39791, Active Motif), rat anti-polyoma middle 
T (AB15085, Abcam), rat anti-CD31 (557355, BD Biosciences), rat anti-CD45 
(553076, BD Biosciences), rabbit anti-Ki67 (AB15580, Abcam) and mouse anti-pan 
cytokeratin (C2562, Sigma). Secondary antibodies were Alexa Fluor 405-conjugated 
goat anti-rabbit (A31556, Thermo Fisher), Alexa Fluor 647 conjugated goat anti-rat 
(A-21247, Life Technologies), peroxidase-conjugated goat anti-FITC (PA1-26804, 
Pierce), biotinylated goat anti-rat (A10517, Thermo Fisher) and biotinylated goat 
anti-rabbit (E043201, Dako). Signal amplification was performed using the TSA 
Fluorescein System (NEL701A001KT, Perkin Elmer) or the TSA Cyanine 5 System 
(NEL705A001KT, Perkin Elmer). 

Different protocols were implemented depending on the epitopes of interest. 
Staining for the following epitopes was combined: CD45, 5hmC, pimonidazole 
and DNA; PyMT, 5hmC, pimonidazole and DNA; Ki67, pimonidazole and DNA; 
CD31 and pimonidazole; and pan-cytokeratin, 5hmC, pimonidazole and DNA. 

Antigen retrieval for CD31, CD45 and pan-cytokeratin was done by a 7-min 
trypsin digestion, and for pimonidazole and Ki67 using AgR at 100°C for 20 min, followed 
by cooling for 20 min. Slides were washed in TBS for 5 min and permeabilized 
in 0.5% Triton X-100 in PBS for 20 min. For 5hmC antigen retrieval, slides were 
denatured in 2 M HCl for 10 min; HCl was neutralized for 2 min in borax (1% in 
PBS, pH 8.5), and slides were washed twice for 5 min in PBS. 

For all slides, endogenous peroxidase activity was quenched using H2O2 (0.3% 
in methanol), followed by three 5-min washes in TBS. Slides were blocked using 
pre-immune goat serum (X0907, Dako; 20% in TNB; TSA Biotin System kit, Perkin 
Elmer). Binding of primary antibodies (anti-5hmC, anti-CD45, anti-CD31 and 
anti-pan cytokeratin or FITC-conjugated anti-pimonidazole; all 1:100 in TNB) 
was allowed to proceed overnight. Slides were washed 3 times in TNT (0.5% 
Triton-X100 in TBS) for 5 min, after which the following secondary antibodies 
(all 1:100 in TNB with 10% pre-immune sheep serum) were allowed to bind for 
45 min: sheep-anti-FITC-PO (for pimonidazole), goat anti-rabbit-Alexa Fluor 405 
(for 5hmC), goat anti-rat-Alexa Fluor 647 (for CD45), and biotinylated goat anti- 
mouse (for pan-cytokeratin). Slides were washed three times for 5 min in TNT, after 
which signal amplification was performed for 8 min using Fluorescein Tyramide 
(1:50 in amplification diluent). 

Slides stained for pimonidazole that required co-staining for Ki67 or PyMT, 
or slides stained for pan-cytokeratin that required co-staining for pimonidazole 
were subjected to a second indirect staining for the latter epitopes. After 5 min of 
TNT and 5 min of TBS, slides were quenched again for peroxidase activity using 
H2O2 and blocked using pre-immune goat serum, prior to a second overnight 
round of primary antibody binding (anti-Ki67, FITC-anti-pimonidazole or anti- 
PyMT, all 1/100). The next day, three 5-min washes with TNT were followed by 
a 1-h incubation with a biotinylated goat anti-rabbit antibody (for Ki67) or goat 
anti-rat (for PyMT), another three 5-min washes with TNT, a 30-min incubation 
with peroxidase conjugated to streptavidin (for Ki67 and PyMT) or to anti-FITC 
(for pimonidazole), another three 5-min washes with TNT and signal amplification 
for 8 min using, for pimonidazole, Fluorescein Tyramide and for others Cyanine 5 
Tyramide (1:50 in amplification diluent). Slides were then stained with propidium 
iodide with RNase (550825; BD biosciences) for 15 min, washed for 5 min in PBS 
and mounted with Prolong Gold (Life Technologies). 

Slides were imaged on a Nikon A1R Eclipse Ti confocal microscope. Three 
to five sections per slide were imaged, and processed using ImageJ. Nuclei were 
identified using the propidium iodide signal and nuclear signal intensities for 
Fluorescein and Cy3 (pimonidazole and 5hmC) measured. Analyses were exclu- 
sively performed on slide regions showing a regular density and shape of nuclei, 
in order to avoid inclusion of acellular or necrotic areas. Pimonidazole also does 
not stain necrotic/acellular areas, and its signal was used to stratify viable cell 
nuclei into normoxic (pimonidazole negative) and hypoxic (pimonidazole positive) 
regions. The 5hmC signals in each population were compared using ANOVA. 
PyMT-negative and CD45-positive cells were counted directly. The fraction of 
pimonidazole and CD31-positive areas was directly quantified using ImageJ across 
ten images per slide. 

Metabolite and protein extraction. For metabolite extractions, 12-well cell culture 
dishes were placed on ice and washed twice with ice-cold 0.9% NaCl, after which 
500 µl of ice-cold 80% methanol was added to each well. Cells were scraped and 
500 µl was transferred to a vial on ice. Wells were washed with 500 µl 80% methanol, 
which was combined with the initial cell extracts. The insoluble fraction was 
pelleted at 4°C by a 10-min 21,000g centrifugation. The pellet (containing the 
proteins) was dried, dissolved in 0.2 N NaOH at 96°C for 10 min and quantified 
using a bicinchoninic acid protein assay (BCA, Pierce), whereas the supernatant 
fraction was processed for metabolite profiling. 

Derivation and measurement of metabolites. The supernatant fraction containing 
the metabolites was transferred to a new vial and dried in a Speedvac. The dried 
supernatant fraction was dissolved in 45 µl of 2% methoxyamine hydrochloride 
in pyridine and held for 90 min at 37°C in a horizontal shaker, followed by 
derivatization through the addition of 60 µl of N-(tert-butyldimethylsilyl)-
N-methyl-trifluoroacetamide with 1% tert-butyldimethylchlorosilane and a 60-min 
incubation at 60°C. Samples were subsequently centrifuged for 5 min at 21,000g 
and 85 µl was transferred to a new vial and analysed using a gas-chromatography 
based mass spectrometer (triple quadrupole, Agilent) operated in Multiple 
Reaction Monitoring (MRM) mode. 

Analysis of metabolite concentrations. For each sample, metabolite measure- 
ments were normalized per sample to the corresponding protein concentration 
estimates and expressed relative to control-treated samples. Four technical 
replicates were run for each sample, and the experiment was repeated 4 times 
using independent samples (n = 16). Differences in metabolite concentration were 
assessed using a two-tailed paired Student's t-test or using analysis of variance with 
post-hoc Tukey HSD when repeated measures were compared. 
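
The two normalization steps described above (per-sample protein normalization, then expression relative to control) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and values are hypothetical, and the control baseline is taken as the mean of the control samples of one experiment, which the text does not specify.

```python
from statistics import mean


def normalize_to_protein(signal, protein_ug):
    """Metabolite signal per microgram of protein in the same well."""
    return signal / protein_ug


def relative_to_control(treated, control):
    """Express treated values relative to the mean of the control values."""
    baseline = mean(control)
    return [value / baseline for value in treated]


# Hypothetical (signal, protein in µg) pairs for one metabolite in one experiment.
control_wells = [(1200.0, 60.0), (1100.0, 55.0), (1260.0, 63.0), (1180.0, 59.0)]
hypoxic_wells = [(900.0, 60.0), (870.0, 58.0), (960.0, 64.0), (885.0, 59.0)]

control = [normalize_to_protein(s, p) for s, p in control_wells]
hypoxic = [normalize_to_protein(s, p) for s, p in hypoxic_wells]
print([round(v, 2) for v in relative_to_control(hypoxic, control)])
```

Normalizing to protein first corrects for differences in cell number between wells before any comparison between conditions is made.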

ROS measurement using 2',7'-dichlorodihydrofluorescein diacetate. MCF7 
cells were cultured in 24-well plates and exposed to 21% (control) or 0.5% O2 
(hypoxia) for 24 h. DMEM used for staining was pre-equilibrated to the required 
O2 tension, and all steps were performed at 21% (control) or 0.5% O2 (hypoxia) using 
a glove box. The cells were washed twice with 500 µl DMEM, and incubated for 
30 min in 2',7'-dichlorodihydrofluorescein diacetate (DCF-DA; 10 µM) in 500 µl 
DMEM, keeping 2 wells unstained by using DMEM without DCF-DA. Cells were treated 
with the indicated concentrations of H2O2 in DMEM for 30 min at 37°C, and fixed 
by adding 33.3 µl of 16% methanol-free paraformaldehyde (Thermo Fisher) for 
8 min at room temperature. The fixative was quenched using glycine (150 µM), and 
cells were washed twice in ice-cold PBS, scraped to detach them and transferred 
to pre-cooled FACS tubes over cell strainers. Cells were kept on ice until they were 
analysed by flow cytometry using a FACSVerse (BD Biosciences). 

Nuclear ROS measurement using nuclear peroxy emerald 1. MCF7 cells were 
seeded on 12-well glass-bottom plates and after 24h exposed to 21% (control) or 
0.5% O2 (hypoxia) for 24 h. PBS used for subsequent staining was pre-equilibrated 
to the required O2 tension, and all washing, treatment and staining steps were 
performed at the appropriate O2 tension (21% or 0.5%) using a glove box. Cells 
were loaded with nuclear peroxy emerald 1 (NucPE1; 5 µM) and Hoechst 33342 
(10 µg ml⁻¹) in PBS for 15 min at 37°C. After washing three times in PBS, control 
cells were incubated with H2O2 (0.5 mM in PBS) as a positive control, or with 
water (control and hypoxia cells) in PBS at 37°C for 20 min. Cells were washed 
three times in PBS, placed on ice and immediately imaged by confocal microscopy. 
The nuclear NucPE1 signal was measured, and averaged across >100 nuclei per 
replicate using ImageJ. This experiment was repeated 5 times on different days, 
and signals compared using a t-test. 

Cell growth measurement using Sulforhodamine B. 5,000 cells/well were seeded 
in three 96-well plates. After 48h, one plate was fixed using trichloroacetic acid 
(3.3% w/v) for 1 h at 4°C, one plate was incubated for 24 h at 37°C under hypoxic and 
one under control conditions (0.5% and 21% O2, respectively). The latter 2 plates 
were subsequently also fixed using trichloroacetic acid (3.3% w/v) for 1 h at 
4°C, and all 3 plates were next analysed using the In vitro Toxicology Assay Kit, 
Sulforhodamine B-based (Sigma) as per the manufacturer's instructions. Growth 
inhibition was calculated as described. 

siRNA transfection. siRNA ON-TARGETplus SMART pools (Thermo) were 
diluted in Opti-MEM I reduced serum medium using Lipofectamine RNAiMAX 
(Life Technologies) to reverse-transfect MCF7 cells in 10-cm dishes (for DNA) or 
6-well plates (for RNA). Cells were transfected 72 h before RNA and DNA extrac- 
tion as described. 

Hydroxylation assay using nuclear extracts. MCF7 cells were cultured for 24h 
under control or hypoxic conditions (21% or 0.5% O2, respectively), chilled on 
ice and processed for extraction of nuclear proteins using the NE-PER Nuclear 
and Cytoplasmic Extraction Kit (Thermo Scientific). The activity of control 
and hypoxic extracts was assessed in parallel using the Colorimetric Epigenase 
5mC-Hydroxylase TET Activity/Inhibition Assay Kit (Epigentek) according to 
the manufacturer’s instructions. Reactions were allowed to proceed for one hour, 
after which washing and detection of 5hmC were done according to the manufacturer’s 
instructions. Differences between hypoxia and control were analysed using 
ANOVA, for 5 independent experiments. 

DNA hydroxymethylation assay using purified Tet enzyme. The genomic DNA 
used in this assay was extracted from Tet triple-knockout ES cells (G.-L. Xu), and 
it therefore was devoid of 5hmC. To enable efficient denaturation, it was digested 
using MseI before the assay and purified using solid phase reversible immobilisation 
paramagnetic beads (Agencourt AMPure XP, Beckman Coulter). The assays 
were performed in Whitley H35 Hypoxystations (Don Whitley Scientific) at 37°C, 
5% CO2, N2, with the following oxygen tensions: 0.1%, 0.3%, 0.5%, 1%, 2.5%, 5%, 
10% and 21%. Hypoxystations were calibrated less than 1 month before all experiments. 
Optimized assay components were as follows: 1.0 µg µl⁻¹ bovine serum 
albumin (New England Biolabs), 50 mM Tris (pH 7.8), 100 µM dithiothreitol (Life 
Technologies), 2 ng µl⁻¹ digested gDNA, 250 µM α-ketoglutarate, 830 µM ascorbate, 
200 µM FeSO4 and 45 ng µl⁻¹ Tet1 enzyme (Wisegene). The major assay components 
(H2O, BSA and Tris) used for all samples were allowed to pre-equilibrate 
at 0.1% O2 for 1 h. These and the remaining assay buffer components (<100 µl) 
were then pre-equilibrated at the desired oxygen tension for 15 min, and mixed 
before addition of Tet1 enzyme in a total reaction volume of 25 µl. Reactions were 
allowed to proceed for 3 min; longer incubations showed a decrease in activity. 
Reactions were stopped with 80 mM EDTA and stored at −80°C. To measure the 
resulting 5hmC content of the DNA, reactions were diluted to 100 µl, denatured for 
10 min at 98°C and analysed in duplicate using the Global 5-hmC Quantification 
Kit (Active Motif) following the manufacturer’s instructions. Michaelis-Menten and 
Lineweaver-Burk plots and the resulting Km values were estimated using R. 
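
To make the Km estimation concrete, the sketch below fits a Lineweaver-Burk line (1/v against 1/[S]) by ordinary least squares in Python. The authors used R and their exact fitting procedure is not described, so this is an illustration only: the function names are hypothetical, the reaction rates are synthetic values generated from an assumed Km of 0.5% O2 and Vmax of 1, and only the oxygen tensions are taken from the text.

```python
def michaelis_menten(s, km, vmax):
    """Reaction rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def lineweaver_burk(substrate, rate):
    """Estimate (Km, Vmax) by least squares on 1/v versus 1/[S].

    The fitted line is 1/v = (Km/Vmax) * (1/[S]) + 1/Vmax.
    """
    x = [1.0 / s for s in substrate]
    y = [1.0 / v for v in rate]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx                    # Km / Vmax
    intercept = y_mean - slope * x_mean  # 1 / Vmax
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax


# Oxygen tensions (% O2) from the assay; rates are synthetic, not measured data.
oxygen = [0.1, 0.3, 0.5, 1.0, 2.5, 5.0, 10.0, 21.0]
rates = [michaelis_menten(s, km=0.5, vmax=1.0) for s in oxygen]

km, vmax = lineweaver_burk(oxygen, rates)
print(f"Km = {km:.3f}% O2, Vmax = {vmax:.3f}")
```

Because the double-reciprocal transform gives the lowest substrate concentrations the largest leverage, noisy real data are often also fitted directly to the Michaelis-Menten equation by nonlinear regression.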

Hypoxia-induced changes in genomic distribution of 5(h)mC in MCF7 cells: 
DIP-seq. To assess where in the genome the levels of 5mC and 5hmC were altered, 
we performed DNA immunoprecipitations coupled to high-throughput sequencing 
(DIP-seq). MCF7 cells were selected for these experiments as they are a cancer 
cell line with high levels of 5hmC and expression of TET genes under control 
conditions, and a cell growth that is unaffected by hypoxia. This enabled us to 
study the effects of hypoxia on TET activity in a cell line that shows high endogenous 
activity, but that is isolated from hypoxia-induced changes in cell proliferation. 
MCF7 cell culture and DNA extractions were as described for LC-MS 
analyses. Library preparations and DNA immunoprecipitations were performed as 
described, using established antibodies targeting 5mC (clone 33D3, Eurogentec) 
and 5hmC (Active Motif catalogue number 39791). For 5hmC-DIP-seq, paired 
barcoded libraries prepared from DNA of hypoxic and control samples were mixed 
before capture, to enable a direct comparison of 5hmC-DIP-seq signal to the input. 
A single end of these libraries was sequenced for 50 bases on a HiSeq 2000, mapped 
using Bowtie and extended for the average insert size (150 bases). Mapping statistics 
are summarized in Supplementary Information Table 11. 

For analysis of sequencing data, MACS peak calling, read depth quantification 
and annotation with genomic features as annotated in EnsEMBL build 77 were 
performed using SeqMonk. Differential (hydroxy-)methylation was quantified 
by EdgeR, using either 3 or 5 independent pairs of control and hypoxic samples 
(for 5hmC-DIP-seq and 5mC-DIP-seq, respectively). These cells were cultured and 
exposed to hypoxia (0.5% O2) or control conditions (21% O2) on different days. 
Results were reported for 5hmC peak areas that exhibited a change significant at 
a P<0.05 and 5% FDR. 

Target enrichment BS-seq using SeqCapEpi. To confirm enrichment of 5mC 
at gene promoters using an independent method, DNA libraries were prepared 
using methylated adapters and the NEBNext DNA library prep master mix set 
following manufacturer recommendations. Libraries were bisulfite-converted using 
the Imprint DNA modification kit (Sigma) as recommended, and PCR ampli- 
fied for 12 cycles using barcoded primers (NEB) and the KAPA HiFi HS Uracil+ 
ready mix (Sopachem) according to manufacturer's instructions. Fragments were 
selected from these libraries using the SeqCapEpi CpGiant Enrichment Kit (Roche) 
following the manufacturer's instructions, and sequenced from both ends for 100 bases 
on a HiSeq 2000. 

For analysing these sequences, sequencing reads were trimmed for adapters 
using TrimGalore and mapped on a bisulfite-converted human genome (GRCh37) 
using BisMark. The number of methylated and un-methylated cytosines in cap- 
tured regions was quantified using Seqmonk for each experiment. Differential 
methylation of regions of interest was assessed by Fisher's exact test and for 5 
independent replicates grown on different days. t-scores were averaged following 
Fisher’s method. Mapping statistics are summarized in Supplementary Table 11. 

RNA-seq. To assess the effect of the increased 5mC occupancy at gene promoters 
on their expression, RNA-seq was performed. Briefly, total RNA was extracted 
using TRIzol (Invitrogen), and remaining DNA contaminants in 17-20 µg of 
RNA were removed using Turbo DNase (Ambion) according to the manufacturer’s 
instructions. RNA was repurified using the RNeasy Mini Kit (Qiagen). Ribosomal 
RNA was depleted from 5 µg of total RNA using the RiboMinus Eukaryote 
System (Life Technologies). cDNA synthesis was performed using the SuperScript III 
Reverse Transcriptase kit (Invitrogen). 3 µg of Random Primers (Invitrogen), 8 µl 
of 5× First-Strand Buffer and 10 µl of RNA mix were incubated at 94°C for 3 min 
and then at 4°C for 1 min. 2 µl of 10 mM dNTP Mix (Invitrogen), 4 µl of 0.1 M 
DTT, 2 µl of SUPERase In RNase Inhibitor 20 U µl⁻¹ (Ambion), 2 µl of SuperScript 
III RT (200 U µl⁻¹) and 8 µl of Actinomycin D (1 µg µl⁻¹) were then added and 
the mix was incubated for 5 min at 25°C, 60 min at 50°C and 15 min at 70°C to 
heat-inactivate the reaction. The cDNA was purified using 80 µl (2× volume) 
of Agencourt AMPure XP and eluted in 50 µl of the following mix: 5 µl of 10× 
NEBuffer 2, 1.5 µl of 10 mM dNTP mix (10 mM dATP, dCTP, dGTP, dUTP, 
Sigma), 0.1 µl of RNaseH (10 U µl⁻¹, Ambion), 2.5 µl of DNA Polymerase I Klenow 
(10 U µl⁻¹, NEB) and the remaining volume of water. The eluted cDNA was incubated 
for 30 min at 16°C, purified by Agencourt AMPure XP and eluted in 30 µl 
of dA-Tailing mix (2 µl of Klenow Fragment, 3 µl of 10× NEBNext dA-Tailing 
Reaction Buffer and 25 µl of water). After 30 min incubation at 37°C, the DNA 
was purified by Agencourt AMPure XP, eluted in TE buffer and quantified on 
NanoDrop. Subsequent library preparation was performed using the DNA library 
prep master mix set and sequencing was performed as described for ChIP-seq. 
Expression levels (reads per million) of genes displaying significant increases in 
methylation at their gene promoter, as determined using SeqCapEpi, were compared 
between control and hypoxic samples using a t-test. Mapping statistics are 
summarized in Supplementary Table 11. 

TCGA samples and data analysis. From the TCGA pan-cancer analysis, we 
selected all solid tumour types for which >100 tumours were available with both 
gene expression data (RNA-seq) and DNA methylation data (Illumina Infinium 
HumanMethylation450 BeadChip). These were 408 bladder carcinomas, 691 breast 
carcinomas, 243 colorectal adenocarcinomas, 520 head and neck squamous cell 
carcinomas, 290 kidney renal cell carcinomas, 430 lung adenocarcinomas, 371 lung 
squamous cell carcinomas, and 188 uterine carcinomas, representing in total 3,141 
unique patients. Corresponding RNA-seq read counts as well as DNA methylation 
data from Infinium HumanMethylation450 BeadChip arrays were downloaded 
from the TCGA server. Breast tumour subtypes were annotated for 208 tumours 
and, for the remaining tumours, imputed by unsupervised hierarchical clustering 
of genes in the PAM50 gene expression signature. Other clinical and histological 
variables were available for >95% of tumours, and missing values were encoded as 
not available. Gene mutation data was available for 129 bladder carcinomas, 646 
breast carcinomas, 200 colorectal adenocarcinomas, 306 head and neck squamous 
cell carcinomas, 241 kidney renal cell carcinomas, 182 lung adenocarcinomas, 74 
lung squamous cell carcinomas, and 3 uterine carcinomas. 

Stratification of tumours for hypoxia and proliferation. To identify which of 
these tumour samples were hypoxic or normoxic, we performed unsupervised 
hierarchical clustering based on a modification (ward.D of the hclust function in R's 
stats package) of the Ward error sum of squares hierarchical clustering method, 
on normalized log-transformed RNA-seq read counts for 14 genes that make up 
the hypoxia metagene signature (ALDOA, MIF, TUBB6, P4HA1, SLC2A1, PGAM1, 
ENO1, LDHA, CDKN3, TPI1, NDRG1, VEGFA, ACOT7 and ADM). In each case 
the top 3 sub-clusters identified were annotated as normoxic, intermediate and 
hypoxic. To identify which of these tumour samples were high- or low-proliferative, 
we performed unsupervised hierarchical clustering based on a modification (ward.D 
of the hclust function in R's stats package) of the Ward error sum of squares hierarchical 
clustering method, on all genes annotated to an established 
tumour proliferation signature (MKI67, NDC80, NUF2, PTTG1, RRM2, BIRC5, 
CCNB1, CEP55, UBE2C, CDC20 and TYMS). Tumours in the top 2 sub-clusters 
identified were labelled as high- or low-proliferative. 

Analysis of the top 1000 CpGs most hypermethylated versus normal tissue. To 
identify tumour-associated hypermethylation events, we compared 450k meth- 
ylation data from tumours and normal tissues. All available DNA methylation 
data from normal tissue (matched or unmatched to tumour samples, on average 
59 per tumour type, representing 472 in total, range = 21-160) were downloaded. 
For each of the 8 tumour types investigated, we selected the top 1,000 CpGs that 
showed the highest average tumour-associated increases in DNA methylation. Per 
tumour type, unsupervised hierarchical clustering based on a modification of the 
Ward error sum of squares hierarchical clustering method (ward.D of the hclust 
function in R’s stats package) annotated the first 3 clusters identified as having 
low, intermediate and high hypermethylation. Cluster co-membership for methylation 
and hypoxia metagene expression was analysed using the Cochran-Armitage 
test for trend. Analyses using the top 100, 500, 5,000 or 10,000 CpGs yielded near 
identical results (not shown). 

Analysis of hypermethylation events. We next applied a method to identify those 
CpGs that exhibit exceptional increases in hypermethylation but that are hyper- 
methylated only in a subset of all tumours. Such rare events are typically found 
in cancer, where hypermethylation inactivates a gene in only a subset of tumours. 
Hypermethylation of individual CpGs at gene promoters (that is, on average 3.7 
CpGs per promoter are represented on the 450K array) in individual tumours was 
assessed as follows. To achieve a normal distribution, all β-values were transformed 
to M-values using $M = \log_2(\beta/(1 - \beta))$. For each tumour type, the mean $\mu$ and 
standard deviation $\sigma$ of the M-value across all control (normoxic) tumours were 
next calculated, irrespective of mutational status, for each CpG, and used to assign 
Z-values to each CpG in each tumour using $Z = (M - \mu)/\sigma$. These Z-values describe 
the deviation from normal methylation variation for that probe. To identify CpGs that 
display an extreme deviation, we selected those for which the Z-value exceeded 
5.6 (that is, $M > \mu + (5.6 \times \sigma)$, corresponding to a Bonferroni-adjusted P value of 
0.01); they were considered as hypermethylation events in that particular tumour. 
This analysis was preferred over Wilcoxon-based models that assess differences 
in the average methylation level between subgroups, as the latter do not enable the 
identification or quantification of the rarer hypermethylation events in individual 
promoters or CpGs. 
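
The event-calling procedure described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names, the toy β-values and the use of the sample standard deviation are assumptions made for the example; only the M-value transform, the Z-value definition and the 5.6 cut-off come from the text.

```python
import math
from statistics import mean, stdev

Z_CUTOFF = 5.6  # threshold stated in the text


def beta_to_m(beta):
    """Convert a methylation beta-value (0 < beta < 1) to an M-value."""
    return math.log2(beta / (1.0 - beta))


def call_events(control_betas, tumour_betas, z_cutoff=Z_CUTOFF):
    """Return (z, is_event) for each tumour beta-value at one CpG.

    control_betas: beta-values of the control (normoxic) tumours at this CpG.
    tumour_betas:  beta-values of the tumours to be scored at the same CpG.
    """
    control_m = [beta_to_m(b) for b in control_betas]
    mu = mean(control_m)
    sigma = stdev(control_m)  # assumption: sample standard deviation
    results = []
    for b in tumour_betas:
        z = (beta_to_m(b) - mu) / sigma
        results.append((z, z > z_cutoff))
    return results


def two_sided_p(z):
    """Two-sided normal tail probability for a Z-value."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Toy data (hypothetical): one CpG, five control tumours, two scored tumours.
controls = [0.10, 0.12, 0.09, 0.11, 0.10]
calls = call_events(controls, [0.11, 0.60])
```

Under a normal approximation, `two_sided_p(5.6)` is of the order of 2 × 10⁻⁸, which is the magnitude a Bonferroni correction at 0.01 over several hundred thousand probes would require; whether a one- or two-sided tail was used is not stated in the text.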

To identify genes with frequently hypermethylated CpGs in their promoter, the 
number of hypermethylation events in that promoter was counted in all tumours, 
and contrasted to the expected number of hypermethylation events in that pro- 
moter (that is, the general hypermethylation frequency multiplied by the number 
of CpGs assessed in that promoter multiplied by the number of tumours) using 
Fisher's exact test. Genes with an associated Bonferroni-adjusted P value below 0.01 
were retained and considered as frequently hypermethylated in that tumour type. 
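
A rough sketch of this comparison is given below. The expected count follows the definition in the text; the layout of the 2×2 table (promoter versus all other CpG-tumour observations, event versus no event) and the one-sided alternative are assumptions, as are the function names and the toy counts.

```python
from math import comb


def expected_events(general_frequency, n_cpgs, n_tumours):
    """Expected number of events in a promoter, as defined in the text."""
    return general_frequency * n_cpgs * n_tumours


def fisher_greater(a, b, c, d):
    """One-sided Fisher's exact test P value for the table [[a, b], [c, d]].

    Tests whether the first row is enriched for the first column, using the
    hypergeometric distribution with all margins fixed.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    denom = comb(total, col1)
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(total - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / denom


# Toy example (hypothetical numbers): a promoter with 4 CpGs in 100 tumours
# shows 12 events; the remaining 10,000 observations show 25 events.
expected = expected_events(0.0025, 4, 100)
p_value = fisher_greater(12, 388, 25, 9975)
```

In an actual analysis the resulting P values would still need the Bonferroni adjustment described above before applying the 0.01 cut-off.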

To assess what fraction of these hypermethylation events are hypoxia-related, 
we assumed that the fraction of events detected under normoxia was hypoxia- 
unrelated, and that all excess events detected in intermediate and hypoxic tumours 
were hypoxia-related. For example, in 691 breast carcinomas, 0.25% of CpGs 
were hypermethylated in 251 normoxic tumours, 0.81% in 350 intermediate and 
1.40% in 90 hypoxic tumours. So, 0.56% and 1.15% of hypermethylation events 
in respectively intermediate and hypoxic tumours were hypoxia-related. Taking 
into account the number of tumours, 0.25% of hypermethylation events (that is, 
(0.25% × 251 + 0.25% × 350 + 0.25% × 90)/691) are not hypoxia-related, and 
0.43% are hypoxia-related (that is, (0% × 251 + 0.56% × 350 + 1.15% × 90)/691). 
So, 63% of all hypermethylation events combined (that is, 0.43/(0.43 + 0.25)) are 
hypoxia-related. To assess the contribution of hypoxia to hypermethylation relative 
to other covariates, partial R² values were calculated for the contribution of each 
covariate in a linear model, and compared to the total R² achieved by the model. 
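
The breast carcinoma example can be reproduced with a few lines of arithmetic. The sketch below is illustrative only; the function name and data layout are assumptions, while the tumour counts and event rates are those quoted in the text.

```python
def hypoxia_related_fraction(groups, baseline_rate):
    """Split the average event rate into baseline and excess components.

    groups:        list of (n_tumours, event_rate_percent) per subgroup.
    baseline_rate: event rate (percent) observed under normoxia, assumed
                   to be hypoxia-unrelated in every subgroup.
    Returns (unrelated, related, fraction_related).
    """
    total = sum(n for n, _ in groups)
    unrelated = sum(n * baseline_rate for n, _ in groups) / total
    related = sum(n * (rate - baseline_rate) for n, rate in groups) / total
    return unrelated, related, related / (related + unrelated)


# Normoxic, intermediate and hypoxic breast carcinomas, as quoted in the text.
groups = [(251, 0.25), (350, 0.81), (90, 1.40)]
unrelated, related, fraction = hypoxia_related_fraction(groups, 0.25)
```

With these inputs the weighted excess rate rounds to 0.43% and the hypoxia-related fraction to 63%, matching the figures in the text.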

To identify genes with CpGs in their promoter that are more frequently 
hypermethylated in hypoxic than normoxic tumours, the number of hypermethylation 
events in that promoter was counted in all hypoxic tumours, and contrasted to 
the number found in normoxic tumours. Differences in frequencies were assessed 
using Fisher's exact test, and genes with a Bonferroni-adjusted P < 0.01 were 
retained and considered hypermethylated upon hypoxia. Enrichment of ontologies 
associated with these genes was assessed using Fisher’s exact test as implemented 
in R’s topGO package. 

Analysis of the effect of hypermethylation events on the expression of associated 
genes. To enable a direct comparison between the expression of different 
genes, we transformed gene expression values (reads per million) to their respective 
z-scores. To assess the impact of hypermethylation events on the expression of 
associated genes, we compared the expression z-scores of all frequently hypermethylated 
genes that contain one or more hypermethylation events in their promoter 
(on average each promoter contains 3.7 CpGs; if one of these is hypermethylated, 
the associated gene is considered hypermethylated in that particular tumour) to 
the expression of all frequently hypermethylated genes that do not contain a 
hypermethylation event. The effect of hypermethylation on gene expression was plotted 
for the 8 main tumour types stratified into normoxic, intermediately hypoxic and 
hypoxic tumours, and for glioblastomas was stratified into normoxic, intermediately 
hypoxic, hypoxic and IDH-mutant tumours (n = 4). The difference in expression 
z-scores between genes not carrying and carrying a hypermethylation event in 
their promoter was assessed using a t-test. 


Analysis of the effect of frequent mutations on the occurrence of hypermeth- 
ylation events. To assess the impact of somatic mutations on hypoxia-associated 
hypermethylation frequencies, we analysed the top 20 genes described to be most 
frequently mutated in the pan-cancer analysis”*, and supplemented this list with 
genes known to cause hypermethylation upon mutation (that is, IDH1, IDH2, 
SDHA, FH, TET1, TET2 and TET3). Mutations in IDH1 and IDH2 were retained 
if they respectively affected amino acid R132, and amino acids R140 or R172. 
Mutations in other genes were scored using Polyphen, and only mutations classified 
as probably damaging were retained. Seven mutations were found in lung tumours, 3 
mutations in colorectal tumours, 8 mutations in breast tumours and 6 mutations 
(all IDH1 R132) in glioblastomas. None of these mutations were enriched in hypoxic 
subsets. In multivariate analyses of variance, in each of the tumour types analysed, 
mutations in these genes were significantly associated with increased hypermeth- 
ylation frequencies. Hypoxia was independently and significantly associated with 
the hypermethylation frequency. 

Correlation between hypermethylation and expression of TET or DNMT 
enzymes. Gene expression values (reads per million) of DNMT and TET enzymes 
were determined for each tumour using available RNA-seq data. The number of 
hypermethylation events at significantly hypermethylated genes in each tumour 
was determined as described above. Hypermethylation in each tumour was sub- 
sequently correlated to TET or DNMT gene expression in that tumour, correcting 
for hypoxia and proliferation status using ANOVA. 

5mC and 5hmC profiling using 450k arrays for 24 lung tumours. Newly diag- 
nosed and untreated non-small-cell lung cancer patients scheduled for curative- 
intent surgery were prospectively recruited. Included subjects had a smoking 
history of at least 15 pack-years. The study protocol was approved by the Ethics 
Committee of the University Hospital Gasthuisberg (Leuven, Belgium). All partici- 
pants provided written informed consent. In the framework of a different project*’, 
RNA-seq was performed on 39 tumours from these patients. Gene expression for 
these samples was clustered for their hypoxia metagene signature”’. This yielded 
2 clear clusters, containing 24 and 15 normoxic and hypoxic tumours, respectively. 
Twelve samples were randomly selected from each cluster, in order to perform 
5hmC and 5mC profiling. 

Illumina Infinium HumanMethylation450 BeadChips. For TAB-ChIP, DNA 
was glycosylated and oxidized as described, using the 5hmC TAB-Seq Kit 
(WiseGene). Subsequently, bisulfite conversion, DNA amplification and array 
hybridization were done following manufacturer's instructions. 

Analysis of TAB-ChIP and BS-ChIP. Data processing was largely as described”. 
In brief, intensity data files were read directly into R. Each sample was normalized 
using Subset-quantile within array normalization (SWAN) for Illumina Infinium 
HumanMethylation450 BeadChips™. Batch effects between chips and experiments 
were corrected using the runComBat function from the ChAMP bioconductor 
package. To obtain 5mC-specific beta values, TAB-ChIP-generated normalized 
beta values were subtracted from the standard 450K-generated normalized 
beta values, exactly as described. 
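
The subtraction step can be illustrated with a short sketch. Standard bisulfite conversion reports 5mC and 5hmC together, whereas TAB treatment reports 5hmC alone, so the per-probe difference estimates 5mC. The function name, probe identifiers and values below are hypothetical and serve only as an example; normalization and batch correction are assumed to have been applied beforehand.

```python
def five_mc_beta(bs_beta, tab_beta):
    """Estimate 5mC-specific beta values per probe.

    bs_beta:  probe -> normalized beta from the standard bisulfite array
              (5mC + 5hmC).
    tab_beta: probe -> normalized beta from the TAB array (5hmC only).
    Probes missing from either input are skipped.
    """
    return {probe: bs_beta[probe] - tab_beta[probe]
            for probe in bs_beta if probe in tab_beta}


# Hypothetical probe IDs and values, for illustration only.
bs = {"cg_example_1": 0.80, "cg_example_2": 0.30, "cg_example_3": 0.50}
tab = {"cg_example_1": 0.15, "cg_example_2": 0.05}
mc = five_mc_beta(bs, tab)
```

Because of measurement noise, differences for weakly methylated probes can come out slightly negative; how such values were handled is not described here.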

Mouse cancer models. All the experimental procedures were approved by the 
Institutional Animal Care and Research Advisory Committee of the KU Leuven. 

Hypoxia induction using sFlk1-overexpression. For sFlk1-overexpression studies, 
male Tg(MMTV-PyMT) FVB mice were intercrossed with wild-type FVB female 
mice. Female pups of the Tg(MMTV-PyMT) genotype were retained, and tumours 
allowed to develop for 9 weeks. Subsequently, 2.5 μg of plasmid (sFlk1-overexpressing 
or empty vector; randomly assigned within litter mates) per gram of mouse 
body weight was introduced into the bloodstream using hydrodynamic tail vein 
injections. sFlk1 overexpression was monitored at 4 days after injection and on 
the day of killing (18 days after the injection), by eye bleeds followed by an ELISA 
for sFlk1 (R&D Systems) in blood plasma. At 12 weeks of age, mice were 
killed and mammary tumours were collected blinded to treatment. 

Hypoxia reduction using Phd2 haplodeficiency. For the Phd2+/− experiments, 
male Tg(MMTV-PyMT) FVB mice were intercrossed with female Phd2+/− 
mice, yielding litters of which half have either a Tg(MMTV-PyMT) genotype or 
a Tg(MMTV-PyMT);Phd2~'* genotype. For the Phd2"" experiments, male 
Tg(MMTV-PyMT) FVB mice were intercrossed with female Tie2-Cre;Phd2“"/" 
mice as described”, yielding litters of which half have either a Tie2-cre;Tg(MMTV- 
PyMT);Phd2“ "7 genotype or a Tie2-cre;Tg(MMTV-PyMT);Phd2~'* genotype. 
At 16 weeks of age, female mice were killed and mammary tumours collected. 

qPCR analysis of expression of Tets and marker genes. RNA was extracted from 
fresh-frozen tumours (stored at −80 °C) using TRIzol (Life Technologies), and converted 
to cDNA and quantified as described for the cell lines. TaqMan probes and 
primers (IDT or Life Technologies) are described in Supplementary Table 12. 

TAB-seq of PyMT tumours. TAB-seq libraries were prepared as described, 
using the 5hmC TAB-Seq Kit (WiseGene). DNA was bisulfite-converted using 
the EZ DNA Methylation-Lightning Kit (Zymo Research) and sequenced as 
described for SeqCapEpi experiments. Reads were mapped to the mouse genome 
(build mm9) and further data processing was as for SeqCapEpi experiments. DNA 
from 3 independent tumours was selected per condition. TET oxidation efficiency 
was required to exceed 99.5%, as estimated using a fully CG-methylated plasmid 
spike-in; 5hmC protection by glycosylation was 65%, as estimated using a fully 
hydroxymethylated plasmid spike-in; and bisulfite conversion efficiencies were 
estimated to exceed 99.8% based on non-CG methylation (equal to percentage 
hypermethylated CpG). Mapping statistics are summarized in Supplementary Table 11. 

Targeted deep BS-seq. As no mouse capture kit was available for targeted BS-seq, 
a specific ampliconBS was developed for a set of 15 tumour suppressor gene pro- 
moters and 5 oncogene promoters. More specifically, DNA was bisulfite-converted 
using the Imprint DNA modification kit and amplified using the MegaMix Gold 
2× mastermix and validated primer pairs. Per sample, PCR products were mixed to 
equimolar concentrations, converted into sequencing libraries using the NEBNext 
DNA library prep master mix set and sequenced to a depth of ~500×. Mapping 
and quantification were done as described for SeqCapEpi. The average and 
variance of methylation level M values in normal mammary glands were used as 
baseline, and amplicons displaying over 3 standard deviations more methylation 
(FDR-adjusted P < 0.05) than this baseline were called as hypermethylated. At 
least 9 different tumours, each from different animals, were profiled per genotype 
or treatment, and differences in hypermethylation frequencies between sets of 
tumours were assessed using Mann-Whitney’s U-test. 

Statistics. Data entry and analysis were performed in a blinded fashion. Statistical 
significance was calculated by two-tailed unpaired t-test (Excel) or analysis 
of variance (R) when repeated measures were compared. Data were tested for 
normality using the D’Agostino-Pearson omnibus test (for n > 8) or the 
Kolmogorov-Smirnov test (for n < 8) and variation within each experimental 
group was assessed. Data are presented as mean ± s.e.m. DNA methylation and 
RNA-seq gene expression data distributions were transformed to a normal 
distribution by conversion to M values and log2 transformation, respectively. Sample sizes 
were chosen based on prior experience for in vitro and mouse experiments, or on 
sample and data availability for human tumour analyses. Other statistical methods 
(mostly related to specific sequencing experiments) are described together with 
the experimental details in other sections of the methods. 

Extended Data Figure 1 | Hypoxia-induced changes in 5hmC, 5mC and 
TET expression. Global 5hmC/C and 5mC/C content of DNA, TET1, TET2 
and TET3 mRNA expression and hypoxia marker gene expression in 15 cell 
lines grown for 24 h under normoxic (21% O2, white) or hypoxic (0.5% O2, 
red) conditions. TET mRNA copy number is expressed relative to B2M for 
human cell lines (HepG2, HT-1080, MCF10A, H358, MCF7, Hep3B, A549, 
H1299, SK-N-Be2c and SHSY5Y), and to Hprt for mouse cell lines (LLC, 
N2A, 4T1, mES and Tet1-KO ES cells). Shown are cell lines derived from liver 
cancer (HepG2 and Hep3B), lung cancer (H358, A549, H1299 and LLC), 
breast cancer (MCF7 and 4T1), fibrosarcoma (HT-1080), neuroblastoma 
(SK-N-Be2c and SHSY5Y), normal breast epithelium (MCF10A) and the 
inner cell mass of blastocyst-stage mouse embryos (mES and Tet1-KO ES 
cells). ALDOA and BNIP3 are expected to be increased, and HIF1A to be 
decreased upon hypoxia. The global 5fC content of ES cells is depicted, but 
was undetectable in cancer cell lines. Bars represent the mean ± s.e.m. of 
five different replicate samples. DNA and RNA from these replicates were 
extracted from cells derived from the same stock vial but grown on different 
days. *P < 0.05, **P < 0.01, ***P < 0.001 by paired t-tests. 

Extended Data Figure 2 | Impact of hypoxia on TET expression. 
a, Changes in Tet1, Tet2 and Tet3 expression in mouse cell lines, at the 
protein level (top row, n=6) and the mRNA level (bottom row, n=5). 
Middle row: representative immunoblot images of Hif1a, Tet1, Tet2 
and Tet3. α-Tubulin serves as loading control, and expression of the 
corresponding coding gene (Tuba1a) was used to normalize mRNA 
expression, enabling a direct comparison of relative protein and relative 
mRNA expression changes. For the same reason, mRNA expression 
was depicted relative to control conditions, in contrast to the absolute 
levels shown in Extended Data Fig. 1. Changes in Tet mRNA and protein 
expression correlate strongly (Pearson’s R= 0.855, P=4 x 10~*). For 
example, both 4T1 and N2A cells displayed increased Tet2 expression at 
the protein and mRNA level. Likewise, ES cells showed no pronounced 
changes at the protein or mRNA level. The overall expression of Tet 
enzymes was not altered in any of these cell lines. For gel source data, 
see Supplementary Fig. 1. b, Hif1β ChIP-seq at the promoters of TET1, 
TET2 and TET3 and at hypoxia marker genes (Bnip3 and Aldoa), with 
peaks or promoter regions highlighted using coloured boxes. Green and 
red boxes correspond to overexpression and no overexpression (specified 
in the figure panel) of the corresponding gene, respectively, as determined 
using TaqMan in Extended Data Fig. 1. Scale: reads per million reads and 
per base pair. c, Left, TET2 expression in MCF7 cells transfected with 
control (white) or TET2-targeting (purple) siRNAs. Right, corresponding 
5hmC levels as determined using LC-MS. d, 5hmC levels as determined by 
LC-MS, in wild-type (white) and Tet1-knockout (purple) ES cells grown 
under 21% (left) and 0.5% (right) O2 tensions. Bars in c and d represent 
the mean ± s.e.m. of five replicate samples from cells derived from 
the same stock vial but grown on different days. *P < 0.05, **P < 0.01, 
***P < 0.001 by paired t-tests (a, c, d). 

Extended Data Figure 3 | Effects secondary to hypoxia. a-e, ROS 
production and redox state of MCF7 cells cultured for 24 h under control 
(21% O2, white) or hypoxic (0.5% O2, red) conditions. Shown are capillary 
gas chromatography mass spectrometry (GC-MS) quantifications of 
changes in the cellular energy state as represented by the adenylate energy 
charge (AEC) (calculated as [ATP + 0.5 × ADP]/[ATP + ADP + AMP]) 
(a); the reducing equivalents of the cell as represented by the relative 
NADH and NADPH levels (calculated as NADH/[NAD+ + NADH] and 
NADPH/[NADP+ + NADPH]); and the reductive capacity of the cell 
as represented by the levels of glutathione (calculated as GSH/[GSH + 
GSSG × 2]). b, c, Quantification (b) and representative FACS intensity 
traces (c) of total ROS levels in MCF7 cells exposed to hypoxia or H2O2, 
as assessed using 2′,7′-dichlorodihydrofluorescein diacetate (DCF-DA). 
d, Nuclear ROS in MCF7 cells as assessed using the nuclear peroxy 
emerald 1 probe (NucPE1). MCF7 cells were exposed to 21% (control) or 
0.5% (hypoxia) O2 for 24 h, after which live cells were loaded with NucPE1 
(5 μM) and Hoechst 33342 (10 μg ml⁻¹) in O2 pre-equilibrated PBS for 
15 min. After washing, control cells were incubated with H2O2 (0.5 mM in 
PBS) as a positive control, or with water (control and hypoxia cells) in PBS 
for 20 min. Cells were washed again and immediately imaged by confocal 
microscopy. Representative images are shown. Scale bar, 50 μm. e, The 
nuclear NucPE1 signal, averaged across >100 nuclei and expressed relative 
to control conditions. f, LC-MS quantification of 8-oxoG concentrations 
in DNA of cell lines cultured for 24 h under control (21% O2, white) 
and hypoxic (0.5% O2, red) conditions. 8-oxoG serves as a marker of 
nuclear ROS. g-i, GC-MS quantification of changes in the indicated 
metabolite levels in mouse ES cells (g), MCF10A cells (h) and MCF7 
cells (i) grown for 24 h under control (21% O2, white), hypoxic (0.5% 
O2, red) or glutamine-free (21% O2, black) conditions. j, Quantities of 
α-ketoglutarate and 2-hydroxyglutarate in MCF7 cells, expressed relative 
to α-ketoglutarate levels in MCF7 cells grown under control conditions 
(21% O2). k, LC-MS quantification of 5hmC levels in response to hypoxia 
(0.5% O2) and glutamine-free culture conditions. l, Growth of cell lines 
cultured for 24 h under control (21% O2, white) and hypoxic (0.5% O2, red) 
conditions, as assessed using a sulforhodamine B colorimetric assay. 
Changes in cell density after 24 h are depicted relative to control 
conditions (21% O2). m, IOX2-induced changes in the global 5hmC 
content of DNA, in TET mRNA expression and in hypoxia marker gene 
expression of five cell lines treated for 24 h with DMSO (carrier, white) 
or IOX2 (50 μM, blue). n, 5mC hydroxylation activity of nuclear lysates 
from MCF7 cells grown for 24 h under 21% or 0.5% O2 (white or red). 
Bars represent the mean ± s.e.m. of 5 (b, k, m), 6 (a, c-e), 16 (g-j) or 24 (l) 
samples prepared on different days. *P < 0.05, **P < 0.01, ***P < 0.001 by 
t-test (b, e, h-m). 

Extended Data Figure 4 | Genomic profiles of 5mC and 5hmC. Shown 
are results from DIP-seq of DNA from MCF7 cells cultured for 24 h under 
21% or 0.5% O2 (control and hypoxia), with examples of 5hmC-DIP-seq 
(top) and 5mC-DIP-seq (bottom) read depths (FPM, fragments per base 
pair per million fragments) at regions surrounding the transcription start 
site of NSD1, FOXA1 and CDKN2A. These show 5hmC loss (FDR < 5%) 
and a 5mC gain that is more subtle, perhaps because the resolution of 
5mC-DIP-seq is limiting: regions rich in 5hmC tend to be poorer in 
5mC, and thus have less substrate available for pull-down. 5mC-DIP-seq 
moreover captures all methylated sites, so most of the 5mC-DIP-seq signal 
does not derive from sites that are actively turning over 5hmC. 

Extended Data Figure 5 | Effect of hypoxia on hypermethylation 
frequency in tumours. a, Immunofluorescence analysis of patient-derived 
tumour xenografts, stained for pimonidazole (white), 5hmC (red), 
DNA (propidium iodide, blue) and pan-cytokeratin (green). Shown are 
representative images of a breast and two endometrial tumour xenografts. 
The inset on the right shows box plots illustrating the signal in normoxic 
pimonidazole-negative nuclei (blue), and in hypoxic pimonidazole-positive 
nuclei (red). b, Hypoxia marker gene expression clusters, with 
the first three clusters used to define normoxic, intermediate and hypoxic 
tumours. c, Unsupervised clustering of 1,000 CpGs showing the highest 
average methylation increase in tumour versus corresponding normal 
tissues. The first three clusters were used to define tumours of low, 
intermediate and high hypermethylation. The colour bar above the 
clusters annotates each tumour as normoxic, intermediate or hypoxic, 
as determined in b. d, Box plots showing the relative expression (z-score) 
of genes in tumours in which they have either 0 or ≥1 hypermethylation 
event in their promoter, stratified into normoxic, intermediate and 
hypoxic tumours (blue, grey and red, respectively). Diamonds indicate 
mean, box plot wedges indicate 2 × the standard error of the median. 
Genes with ≥1 hypermethylation events in their promoters have a lower 
average expression level (P < 0.01 for each tumour type). e, Fraction of 
genes having a promoter that is rich, intermediate or poor in CpGs, out of 
all gene promoters that are assessed on the 450k array, and out of all gene 
promoters that are frequently hypermethylated in the indicated tumour 
types. f, Fraction of 1,742 TET wild-type tumours and 39 TET-mutant tumours 
that are normoxic, intermediate and hypoxic. P > 0.2 for all comparisons. 
g, Cell proliferation marker gene expression clusters, with the first two 
clusters used to define high-proliferative and low-proliferative tumours. 
h, Hypermethylation frequencies in low- and high-proliferative tumours, 
with asterisks representing P values from linear models correcting 
for variables specified in Supplementary Table 8. i, Partial correlation 
coefficient (partial R²) estimates of the relative contribution of tumour 
characteristics (annotated in TCGA) to the variance in hypermethylation 
observed in these tumours. Partial R² values were obtained from linear 
model estimation using ordinary least squares, and expressed as a fraction 
of the total variance (that is, total R²) explained by the model when taking 
into account all indicated variables, as indicated between brackets under 
each tumour type. *P < 0.05, **P < 0.01, ***P < 0.001 by t-test (a) or by 
generalized linear model (h). 

Extended Data Figure 6 | Functional annotation of genes more 
frequently hypermethylated in hypoxic tumours. a, Ontology term 
enrichment analysis of genes that are more frequently hypermethylated at 
their gene promoters in hypoxic than normoxic tumours, for eight tumour 
types characterized in the TCGA pan-cancer effort. A representative 
set of terms is displayed, selected from terms enriched in most tumour 
types. P values as defined by the grey-scale insert. Enrichment calculated 
using topGO. b, Selected examples of hypermethylation frequencies 
in the promoter of key TSGs (PTEN, CDKN1C, ATM) more frequently 
hypermethylated in normoxic than hypoxic tumours. c, Hypermethylation 
frequency in the promoter of selected genes involved in the processes 
indicated. P < 0.05 for all genes (asterisks are not displayed). Bars in 
b and c represent the hypermethylation frequency ± s.e.m. P values in 
a by Fisher’s exact test. 

Extended Data Figure 7 | Effect of hypoxia on TET activity in human 
tumours. a, The t-value of correlation between hypermethylation 
and expression of TET or DNMT genes across 3,141 tumours of 
8 tumour types (bladder, breast, colorectal, head and neck, kidney, lung 
adenocarcinoma, lung squamous, and uterine carcinoma) profiled in 
TCGA for gene expression and DNA methylation, while correcting 
for tumour type, hypoxia and proliferation. The dotted line represents 
P < 0.05; negative t-values represent inverse correlations. b, Hypoxia 
metagene signature applied to 63 glioblastoma multiforme tumours from 
TCGA. c, Box plots showing the relative expression (z-score) of genes in 
tumours in which they have either 0 or ≥1 hypermethylation event in 
their promoter, stratified into wild-type IDH1 tumours that are normoxic 
(n = 19), intermediate (n = 21) and hypoxic (n = 17) (blue, grey and red, 
respectively), and IDH1 R132-mutated tumours (n = 4, yellow). Diamonds 
indicate mean, box plot wedges indicate 2 × the standard error of the 
median. Genes with ≥1 hypermethylation events in their promoters have a 
lower average expression level. No hypermethylation events were detected 
in wild-type IDH1 normoxic tumours. d, Hypoxia metagene signature 
applied to 12 normoxic and 12 hypoxic non-small-cell lung tumours. 
*P < 0.05, ***P < 0.001 by t-test (c). 

Extended Data Figure 8 | 5hmC, hypoxia and TSG hypermethylation 
in mouse breast tumours. a, Frequency of hypermethylation events in 
the promoters of all genes, all oncogenes and all TSGs as annotated, 
in 695 human breast tumours available through TCGA and stratified into 
normoxic, intermediate and hypoxic subsets. b, c, DNA was extracted 
from 53 tumours developing in MMTV-PyMT mice of the indicated 
ages (b) or weights (c) and sequenced to a depth of ~500×. Plotted are 
z-scores of hypermethylation (y axis, exponential) for 15 TSGs, relative 
to the tumours from 11-week-old mice. The dotted line represents the 
threshold for a Bonferroni-adjusted P < 0.05, and bold darker dots are 
used for tumours displaying significantly increased hypermethylation 
events. d, DNA extracted from 20 normal mammary glands from 
14-week-old mice, PCR-amplified for 15 TSGs and sequenced to a depth 
of ~500×. Plotted are z-scores of hypermethylation relative to 11-week-old 
tumours. e, Staining of PyMT tumours for 5hmC (red), DNA (propidium 
iodide, blue), pimonidazole (white) and PyMT (green), and fraction of 
PyMT-positive cells in normoxic and hypoxic areas. The area outlined 
corresponds to the hypoxic, pimonidazole-positive section; arrowheads 
point to PyMT-negative cells. Scale bar, 25 μm. The bar chart inset 
illustrates the relative number of PyMT-positive cells in normoxic and 
hypoxic areas (grey and red, respectively; n = 19). f, Ki67-positive cells 
in PyMT tumours: representative image of staining for DNA (propidium 
iodide, blue), Ki67 (red) and pimonidazole (green). Scale bar, 50 μm. 
The bar chart inset illustrates the quantification of Ki67-positive cells in 
normoxic and hypoxic areas (grey and red, respectively) across 6 tumours, 
analysing 3 fields of view with over 150 cells per field of view. g, CD45-positive 
cells in PyMT tumours: representative image of staining for DNA 
(propidium iodide, blue), 5hmC (red), pimonidazole (green) and CD45 
(white). Scale bar, 100 μm. The bar chart inset illustrates the quantification 
of CD45-positive cells in normoxic and hypoxic areas (white and red, 
respectively) of 11 tumours, capturing on average ~2,500 nuclei per 
analysis. ***P < 0.001 in (a) by Fisher’s exact test, significance relative to 
all genes. 

Extended Data Figure 9 | Manipulation of tumour oxygenation in 
mouse breast tumours, and effects on 5hmC, TSG hypermethylation 
and confounders. a, Plasma sFlk1 concentrations at the indicated 
times after hydrodynamic injection with an empty (n = 7) or 
sFlk1-overexpression plasmid (n = 5) (grey and red, respectively). b, c, 
Quantification of tumour vessel number (b) and hypoxic areas (c) of 
tumours from transgenic MMTV-PyMT mice, hydrodynamically injected 
with an empty or sFlk1-overexpression plasmid, with representative 
images of blood vessels stained for CD31 (b) and hypoxic areas stained 
for pimonidazole adducts (c). Scale bar, 100 μm. d, Changes in RNA 
expression of hypoxia marker genes that are known to be downregulated 
(Mrc1) or upregulated (Bnip3, Car9, Ddit4) in hypoxic conditions. 
e, 5hmC levels (y axis) across mouse chromosome 18 (x axis) in 400 kb 
bins, with the location of RefSeq genes (middle), and differences in 5hmC 
levels (lower). 5hmC levels were determined using shallow TAB-seq, and 
chromosome 18 was selected because it has large stretches of gene 
deserts that illustrate the 5hmC depletion in these areas (n = 3). 5hmC 
levels decrease by 12.4% ± 3.5 after sFlk1 overexpression, although 
technical limitations of TAB-seq (incomplete 5hmC protection or 
bisulfite conversion) may partially obscure the magnitude of effects. 
f, Hypermethylation in tumours developing in 12-week-old mice receiving 
hydrodynamic injection with an empty (n = 19) or sFlk1-overexpressing 
plasmid (n = 24) 3 weeks earlier. DNA was bisulfite-converted, 
PCR-amplified for the indicated oncogenes, and sequenced to a depth of 
~500×. Plotted are z-scores of hypermethylation (y axis, exponential), 
relative to the more normoxic tumours (that is, empty). The dotted line 
represents the threshold at 5% FDR, and bold darker dots the tumours 
displaying significantly increased hypermethylation events. g-j, Relative 
weights of tumours from Tg(MMTV-PyMT) mice, hydrodynamically 
injected with an empty (grey, n = 19) or sFlk1-overexpression plasmid 
(red, n = 24) (g), and corresponding RNA expression of Ptprc (the gene 
encoding CD45, n = 5) (h), of Tet enzymes (i, n = 15 for empty plasmid, 
n = 12 for sFlk1-overexpressing plasmid) and of cell proliferation markers 
(j, n = 5 for each). k-m, As in d-f but for 16-week-old transgenic 
MMTV-PyMT mice of the indicated genotype. n = 5 (k), n = 3 for Phd2+/+; n = 4 
for Phd2+/− (l) and n = 9 (m). n, As in d, but for 16-week-old 
Tie2-Cre;Tg(MMTV-PyMT) mice of the indicated genotypes (n = 5). o, DNA 
was extracted from 17 breast tumours developing in Tie2-Cre;Phd2™W7, 
Tg(MMTV-PyMT) mice (blue) and 13 breast tumours developing in 
Tie2-Cre;Phd2“ /W1;Tg(MMTV-PyMT) mice (grey), all 16 weeks old. 
DNA was bisulfite-converted, PCR-amplified for the indicated TSGs 
(left) and oncogenes (middle) and sequenced to a depth of >500x. 
Plotted are z-scores of hypermethylation (y axis, exponential), relative 

to the more normoxic, Phd2“™", tumours. The dotted line represents 

the threshold for a Bonferroni-adjusted P < 0.05, and bold darker dots 
the tumours displaying significantly increased hypermethylation events. 
Right, 5hmC levels + s.e.m. across a metagene in tumours of 16-week-old 
mice with the indicated genotype (n= 3 for Phd2™"; n = 4 for Phd2W™"), 
p-u, Relative weights of tumours from Phd2+/−;Tg(MMTV-PyMT) mice 
and Phd2+/+;Tg(MMTV-PyMT) mice (n = 10 and 9, respectively) (p-r) and 
from Tie2-Cre;Phd2“?;Tg(MMTV-PyMT) and Tie2-Cre;Phd” 1/7; 
Tg(MMTV-PyMT) mice (n= 17 and 13, respectively) (s-u), and the 
corresponding RNA expression of cell proliferation markers (n=5, p, s), 
of Tet enzymes (n=5, q, t) and of Ptprc (n=5) (r, u). #P < 0.10, *P < 0.05, 
**P < 0.01, ***P < 0.001 by t-test. 
A combined transmission spectrum of the
Earth-sized exoplanets TRAPPIST-1 b and c

Three Earth-sized exoplanets were recently discovered close to the
habitable zone (refs 1, 2) of the nearby ultracool dwarf star TRAPPIST-1
(ref. 3). The nature of these planets has yet to be determined, as 
their masses remain unmeasured and no observational constraint is 
available for the planetary population surrounding ultracool dwarfs, 
of which the TRAPPIST-1 planets are the first transiting example. 
Theoretical predictions span the entire atmospheric range, from
depleted to extended hydrogen-dominated atmospheres (refs 4-8). Here we
report observations of the combined transmission spectrum of the 
two inner planets during their simultaneous transits on 4 May 2016. 
The lack of features in the combined spectrum rules out cloud-free
hydrogen-dominated atmospheres for each planet at ≥10σ levels;
TRAPPIST-1 b and c are therefore unlikely to have an extended
gas envelope, as they occupy a region of parameter space in which
high-altitude cloud/haze formation is not expected to be significant
for hydrogen-dominated atmospheres (ref. 9). Many denser atmospheres
remain consistent with the featureless transmission spectrum—from 
a cloud-free water-vapour atmosphere to a Venus-like one. 

On 4 May 2016, we observed the simultaneous transits of the Earth- 
sized planets TRAPPIST-1b and TRAPPIST-1c with the Hubble Space 
Telescope (HST). This rare event was phased with HST’s visibility win- 
dow of the TRAPPIST-1 system, allowing for complete monitoring of 
the event (Fig. 1). Observations were conducted in ‘round-trip’ spatial
scanning mode (ref. 10) using the near-infrared (1.1-1.7 μm) G141 grism on
the wide-field camera 3 (WFC3) instrument (see Methods). Following
standard practice, we monitored the transit event through four HST 
orbits, taking observations before, during and after the transit event 
to acquire accurate stellar baseline flux levels. We discarded the first 
orbit owing to differing systematics caused by the thermal settling of
the telescope following target acquisition (refs 11, 12). The raw light curve
presents primarily ramp-like systematics on the scale of HST orbit-
induced instrumental settling, discussed in previous WFC3 transit
studies (refs 11-14) (Fig. 1). We reduced, corrected for instrumental
systematics, and analysed the data using independent methods 
(see Methods) that yielded consistent results. We reached an average 
standard deviation of normalized residuals (SDNR) of 650 parts per
million (p.p.m.) per 112-second exposure (Fig. 2) on the spectrophotometric
time series split into 11 channels (resolution λ/Δλ ≈ 35).
Summing over the entire WFC3 spectral range, we derived a ‘white’ 
light curve with a 240-p.p.m. SDNR (Fig. 1). 

We first analysed the white-light curve by fitting the transits of
TRAPPIST-1b and TRAPPIST-1c simultaneously, while accounting for
instrumental systematics. Owing to the limited phase coverage of HST
observations, we fixed the system's parameters to the values provided
in the discovery report (ref. 3) while estimating the transit times and depths.
However, we let the band-integrated limb-darkening coefficients 

Figure 1 | Hubble/WFC3 white-light curve for the TRAPPIST-1b and
TRAPPIST-1c double transit of 4 May 2016. a, Raw normalized white-
light curve (triangles), highlighting the primary instrumental systematics
(the forward/reverse flux offset and the ramp; see Methods). The shaded
areas represent time windows during which no exposure was taken owing
to occultation by the Earth. b, Normalized and systematics-corrected
white-light curve (black points) and best-fit transit model (blue line).
The individual contributions of TRAPPIST-1b and TRAPPIST-1c are
shown in green and red, respectively. c, Best-fit residuals with their 1σ
error bars (SDNR = 240 p.p.m.).


(LDCs) and the orbital inclinations for planets b and c (i_b and i_c,
respectively) float under the control of priors, to propagate their uncertainties
on the transit depth and time estimates with which they may
be correlated. These priors were derived from the PHOENIX model
intensity spectra (ref. 15) for the LDCs (see Methods) and from the discovery
report (ref. 3) for the planets’ orbital inclinations. We find that TRAPPIST-1c

Figure 2 | Hubble/WFC3 spectrophotometry of the TRAPPIST-1b
and TRAPPIST-1c double transit of 4 May 2016. a, Normalized and 
systematics-corrected data (points) and best-fit transit model (solid line) 
in 11 spectroscopic channels spread across the WFC3 band, offset for 


began its transit 12 minutes before TRAPPIST-1b (transit time centres
at barycentric Julian date (BJD)/barycentric dynamical time (TDB)
− 2,457,512: T_0,b = 0.88646 ± 0.00030 and T_0,c = 0.88019 ± 0.00016;
transit durations (ref. 3): W_b = 36.12 ± 0.46 min and W_c = 41.78 ± 0.81 min).
The difference between the planets’ transit durations of 5.6 ± 0.9 min
implies that no planet-planet eclipse (ref. 16) occurred during the observed
event, given the well established orbital periods. Standard transit
models (ref. 17) are therefore adequate for analysing this data set. We find
an orbital inclination and transit depth across the full WFC3 band of
i_b = 89.39° ± 0.32° and ΔF_b = 8,015 ± 220 p.p.m. for TRAPPIST-1b,
and i_c = 89.58° ± 0.11° and ΔF_c = 7,290 ± 240 p.p.m. for TRAPPIST-1c.
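
The timing figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration, not the authors' analysis: it assumes the quoted uncertainties are independent and Gaussian, and that each transit begins half a duration before its centre. Under those assumptions the mid-transit offset comes out near 9 min, the duration difference near 5.7 ± 0.9 min, and the offset between transit starts near 12 min.

```python
import math

# Values quoted in the text (days after BJD 2,457,512, and minutes).
T0_B, T0_B_ERR = 0.88646, 0.00030   # mid-transit time of TRAPPIST-1b
T0_C, T0_C_ERR = 0.88019, 0.00016   # mid-transit time of TRAPPIST-1c
W_B, W_B_ERR = 36.12, 0.46          # transit duration of b (min)
W_C, W_C_ERR = 41.78, 0.81          # transit duration of c (min)

MIN_PER_DAY = 24.0 * 60.0


def quadrature(*errors):
    """Combine independent 1-sigma errors in quadrature (an assumption)."""
    return math.sqrt(sum(e * e for e in errors))


# Offset between the two mid-transit times, in minutes.
mid_offset = (T0_B - T0_C) * MIN_PER_DAY
mid_offset_err = quadrature(T0_B_ERR, T0_C_ERR) * MIN_PER_DAY

# Difference between the two transit durations, in minutes.
duration_diff = W_C - W_B
duration_diff_err = quadrature(W_B_ERR, W_C_ERR)

# Offset between transit starts: each transit begins half a duration
# before its centre, so the longer transit of c starts earlier still.
start_offset = mid_offset + 0.5 * duration_diff

print(f"mid-transit offset : {mid_offset:.1f} +/- {mid_offset_err:.1f} min")
print(f"duration difference: {duration_diff:.1f} +/- {duration_diff_err:.1f} min")
print(f"start offset       : {start_offset:.1f} min")
```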

In the context of double-transit observations, the data primarily
constrain the combined transit depths (ΔF_{b+c} = 15,320 ± 160 p.p.m.).
Therefore, although the partial transit of TRAPPIST-1c, before
TRAPPIST-1b begins its transit, yields some constraints on ΔF_c, it
is not sufficient to completely lift the degeneracy between ΔF_b (being
ΔF_{b+c} − ΔF_c) and ΔF_c. This explains the ~30% better precision
obtained with the combined transit depth, and hence also with the
combined transmission spectrum. The transit depths derived over
WFC3’s band are in agreement, within 2σ, with the values reported
at discovery (ref. 3).
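
The degeneracy described above can be made concrete by asking what correlation between the two individual depths would reproduce the quoted uncertainties. The following sketch is an interpretation, not part of the reported analysis: it assumes Gaussian errors and linear error propagation, so that Var(b+c) = σ_b² + σ_c² + 2ρσ_bσ_c, and solves for ρ. Under that assumption the individual depths are strongly anti-correlated (ρ close to −0.76), and the combined depth is roughly 27% to 33% more precise than either individual depth.

```python
# 1-sigma uncertainties quoted in the text (p.p.m.).
SIGMA_B = 220.0      # depth of TRAPPIST-1b
SIGMA_C = 240.0      # depth of TRAPPIST-1c
SIGMA_SUM = 160.0    # combined depth of b + c


def implied_correlation(sigma_x, sigma_y, sigma_sum):
    """Correlation rho such that Var(x + y) = sx^2 + sy^2 + 2*rho*sx*sy.

    Assumes Gaussian errors and linear propagation (an illustrative
    assumption, not a statement about the authors' posterior).
    """
    return (sigma_sum**2 - sigma_x**2 - sigma_y**2) / (2.0 * sigma_x * sigma_y)


rho = implied_correlation(SIGMA_B, SIGMA_C, SIGMA_SUM)

# Fractional precision gain of the combined depth over each single depth.
gain_vs_b = 1.0 - SIGMA_SUM / SIGMA_B
gain_vs_c = 1.0 - SIGMA_SUM / SIGMA_C

print(f"implied correlation between depths: {rho:.2f}")
print(f"precision gain vs b: {gain_vs_b:.0%}, vs c: {gain_vs_c:.0%}")
```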

We then analysed the light curves in 11 spectroscopic channels, 
fitting for wavelength-dependent transit depths, instrumental system- 
atics, and stellar baseline levels (Fig. 2). We tried both quadratic and 
four-parameter limb-darkening relationships (ref. 18) for each spectroscopic
channel, because transit depth estimates may depend on the functional 
form used to describe limb darkening. We found, however, that our 
conclusions are not sensitive to which limb-darkening relationship was 
chosen, as long as the wavelength dependence of the LDCs is taken
into account. The resulting transmission spectra are consistent with 
a flat line (Fig. 3). 

Figure 3 shows the transit depth variations expected over the
WFC3 band if TRAPPIST-1b and/or TRAPPIST-1c were harbouring
clarity. The individual contributions of TRAPPIST-1b and TRAPPIST-1c
are shown with dotted and dashed lines, respectively. b, Best-fit residuals
with their 1σ error bars (channel-averaged SDNR = 650 p.p.m.).




a cloud-free hydrogen-dominated atmosphere (red lines and circles
in Fig. 3). Our transmission spectrum model (ref. 19) sets atmospheric
temperature to the planet's equilibrium temperature (T_eq,b = 366 K
and T_eq,c = 315 K), assuming a Bond albedo of 0.3. Because the
planetary masses remain unmeasured, we conservatively use a
mass of 0.95M⊕ and 0.85M⊕ for TRAPPIST-1b and TRAPPIST-1c,
respectively (M⊕ being the mass of Earth); these are the maximum
masses that allow them to possess hydrogen/helium envelopes greater
than 0.1% of their total masses given their radii (ref. 20). The precision
achieved with the combined transmission spectrum (~350 p.p.m.
per bin) is sufficient to detect the presence of a cloud-free hydrogen-
dominated atmosphere via the detection of water or methane absorption
features. The featureless spectra rule out a cloud-free, hydrogen-
dominated atmosphere for TRAPPIST-1b and TRAPPIST-1c at the 12σ
and 10σ levels, respectively.

We also show in Fig. 3 alternative atmospheres for TRAPPIST-1b
and TRAPPIST-1c that are consistent with the data; volatile (water)-
rich atmospheres and hydrogen-dominated atmospheres with a cloud
deck at 10 mbar are shown in blue and in yellow, respectively. Many
alternatives for the atmospheres of TRAPPIST-1b and TRAPPIST-1c
still remain. The atmospheric screening of sub-Neptune-sized
exoplanets using existing observatories is a step-by-step process (refs 14, 21, 22).
As for the super-Earth-sized planet GJ 1214b (ref. 21), the first
observations of TRAPPIST-1’s planets with HST allow us to rule out
a cloud-free hydrogen-dominated atmosphere for either planet. If
the planets’ atmospheres are hydrogen-dominated, then they must
contain clouds or hazes that are grey absorbers between 1.1 μm and
1.7 μm at pressures less than around 10 mbar. However, theoretical
investigations for hydrogen-dominated atmospheres (ref. 9) predict that
the efficiencies of haze and cloud formation at the irradiation levels
of TRAPPIST-1b and TRAPPIST-1c should be dramatically reduced
compared with, for example, the efficiencies for GJ 1214b (insolation
ratios: S_GJ1214b/S_b ≈ 4; S_GJ1214b/S_c ≈ 8), leading to cloud formation at

Figure 3 | Transmission spectra of TRAPPIST-1b and TRAPPIST-1c
compared with models. a-c, Theoretical predictions of TRAPPIST-1b’s
transmission spectrum (b), TRAPPIST-1c’s spectrum (c), and their
combinations (a) are shown for cloud-free H2-dominated atmospheres
(red lines and circles), H2-dominated atmospheres with a cloud deck
at 10 mbar (yellow lines and circles), and cloud-free H2O-dominated


pressure levels of 100 mbar or more, with marginal effects on their
transmission spectra (ref. 9). In short, hydrogen-dominated atmospheres
can be considered as unlikely for TRAPPIST-1b and TRAPPIST-1c.
Planets with the sizes and equilibrium temperatures of TRAPPIST-1b
and TRAPPIST-1c could possess relatively thick H2O-, CO2-, N2- or
O2-dominated atmospheres, or potentially tenuous atmospheres
composed of a variety of chemical species (refs 7, 8, 23). All of these
denser atmospheres are consistent with our measurements. The amplitude
of a planet’s transmission spectrum scales inversely with its atmospheric
mean molecular weight, μ. The amplitude of an exoplanet’s
transmission spectrum can be expressed as 2 R_p h_eff / R_*^2, where R_p and
R_* are the planetary and stellar radii, and h_eff is the effective atmospheric
height (that is, the extent of the atmospheric annulus), which is directly
proportional to the atmospheric scale height, H = kT/μg, where k is
Boltzmann’s constant, T is the atmospheric temperature, and g is the
surface gravity. Therefore, everything else being equal, the transmission
spectrum amplitude of a denser atmosphere is significantly damped
compared with that of a hydrogen-dominated atmosphere (for
example, by a factor of about seven for an H2O-dominated atmosphere).
As a result, no constraint on the presence and minimum pressure level
of clouds/hazes for such denser atmospheres can be inferred from our
data. TRAPPIST-1b and TRAPPIST-1c could, for instance, harbour a
cloud-free water-vapour atmosphere or a Venus-like atmosphere with




atmospheres (blue lines and circles). The coloured circles show the binned
theoretical models. The feature at 1.4 μm arises from water absorption.
The significance of the deviation of each transmission spectrum from
the WFC3 measurements (black circles with 1σ error bars) is listed in
parentheses in each panel.


high-altitude hazes (refs 24, 25). We shall soon be able to distinguish between
such atmospheres. The transmission spectrum of Venus as an exoplanet
would present broad variations of about 2 p.p.m. from 0.2 μm to
5 μm (ref. 26), which, rescaled to the TRAPPIST-1 star, correspond to
variations of about 160 p.p.m. (2 × (R_Sun/R_TRAPPIST-1)^2), currently below
our errors, but eventually reachable.
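
The scaling argument in the preceding paragraph (spectrum amplitude of roughly 2 R_p h_eff / R_*^2, with h_eff proportional to H = kT/μg) can be sketched numerically. This is a rough illustration under stated assumptions, not the authors' model: the planetary radius (1.11 Earth radii), the stellar radius (0.117 solar radii) and the mean molecular weights (2.3 for a hydrogen-dominated and 18 for a water-dominated atmosphere) are values assumed here for illustration, and the amplitude is evaluated for one scale height only, whereas real spectral features typically span a few.

```python
# Physical constants (SI units).
K_B = 1.380649e-23      # Boltzmann constant, J/K
AMU = 1.66053907e-27    # atomic mass unit, kg
G = 6.674e-11           # gravitational constant, m^3 kg^-1 s^-2
M_EARTH = 5.972e24      # kg
R_EARTH = 6.371e6       # m
R_SUN = 6.957e8         # m

# Assumed inputs for TRAPPIST-1b (mass and temperature from the text;
# radii and molecular weights are illustrative assumptions).
MASS_B = 0.95 * M_EARTH
RADIUS_B = 1.11 * R_EARTH
RADIUS_STAR = 0.117 * R_SUN
T_EQ_B = 366.0
MU_H2, MU_H2O = 2.3, 18.0


def surface_gravity(mass, radius):
    """Newtonian surface gravity g = G M / R^2."""
    return G * mass / radius**2


def scale_height(temperature, mu, gravity):
    """Atmospheric scale height H = k T / (mu g), mu in atomic mass units."""
    return K_B * temperature / (mu * AMU * gravity)


def amplitude(r_planet, h_eff, r_star):
    """Transmission-spectrum amplitude 2 R_p h_eff / R_*^2 (dimensionless)."""
    return 2.0 * r_planet * h_eff / r_star**2


g_b = surface_gravity(MASS_B, RADIUS_B)
h_h2 = scale_height(T_EQ_B, MU_H2, g_b)
h_h2o = scale_height(T_EQ_B, MU_H2O, g_b)

amp_h2 = amplitude(RADIUS_B, h_h2, RADIUS_STAR)
amp_h2o = amplitude(RADIUS_B, h_h2o, RADIUS_STAR)
damping = amp_h2 / amp_h2o   # equals MU_H2O / MU_H2, everything else equal

print(f"H (H2-dominated): {h_h2 / 1e3:.0f} km")
print(f"amplitude per scale height: {amp_h2 * 1e6:.0f} p.p.m. (H2), "
      f"{amp_h2o * 1e6:.0f} p.p.m. (H2O)")
print(f"damping factor: {damping:.1f}")
```

With these assumed inputs the damping factor is simply the ratio of the mean molecular weights, a little under eight, in line with the "about seven" quoted in the text.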

Screening TRAPPIST-1’s Earth-sized planets now—to distinguish 
progressively between their plausible atmospheric regimes, and to 
determine their amenability for detailed atmospheric studies—will 
allow the optimization of follow-up studies with the next generation 
of observatories. Our work highlights HST/WFC3’s ability to perform 
the first step towards a thorough understanding of these planets’ 
atmospheric properties. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 18 May; accepted 4 June 2016. 
Published online 20 July 2016. 


1. Kopparapu, R. K. et al. Habitable zones around main-sequence stars: 
new estimates. Astrophys. J. 765, 131 (2013). 

2. Zsom, A., Seager, S., de Wit, J. & Stamenkovic, V. Towards the minimum
inner edge distance of the habitable zone. Astrophys. J. 778, 109 (2013). 


1 SEPTEMBER 2016 | VOL 537 | NATURE | 71 


© 2016 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 


3. Gillon, M. et al. Temperate Earth-sized planets transiting a nearby ultracool
dwarf star. Nature 533, 221-224 (2016).

4. Owen, J. E. & Wu, Y. Kepler planets: a tale of evaporation. Astrophys. J. 775,
105 (2013).

5. Jin, S. et al. Planetary population synthesis coupled with atmospheric escape:
a statistical view of evaporation. Astrophys. J. 795, 65 (2014).

6. Johnstone, C. P. et al. The evolution of stellar rotation and the hydrogen
atmospheres of habitable-zone terrestrial planets. Astrophys. J. 815, L12
(2015).

7. Luger, R. & Barnes, R. Extreme water loss and abiotic O2 buildup on planets
throughout the habitable zones of M dwarfs. Astrobiology 15, 119-143 (2015).

8. Owen, J. E. & Mohanty, S. Habitability of terrestrial-mass planets in the HZ
of M dwarfs. I. H/He-dominated atmospheres. Mon. Not. R. Astron. Soc. 459,
4088-4108 (2016).

9. Morley, C. V. et al. Thermal emission and reflected light spectra of super Earths
with flat transmission spectra. Astrophys. J. 815, 110 (2015).

10. McCullough, P. & MacKenty, J. Considerations for using spatial scans with WFC3.
Instr. Sci. Report WFC3 2012-08 (Space Telescope Science Institute, 2012).

11. Deming, D. et al. Infrared transmission spectroscopy of the exoplanets HD
209458b and XO-1b using the wide field camera-3 on the Hubble Space
Telescope. Astrophys. J. 774, 95 (2013).

12. Wakeford, H. R., Sing, D. K., Evans, T., Deming, D. & Mandell, A. Marginalizing
instrument systematics in HST WFC3 transit light curves. Astrophys. J. 819,
10 (2016).

13. Sing, D. K. et al. A continuum from clear to cloudy hot-Jupiter exoplanets
without primordial water depletion. Nature 529, 59-62 (2016).

14. Kreidberg, L. et al. Clouds in the atmosphere of the super-Earth exoplanet
GJ1214b. Nature 505, 69-72 (2014).

15. Husser, T.-O. et al. A new extensive library of PHOENIX stellar atmospheres and
synthetic spectra. Astron. Astrophys. 553, A6 (2013).

16. Hirano, T. et al. Planet-planet eclipse and the Rossiter-McLaughlin effect of a
multiple transiting system: joint analysis of the Subaru spectroscopy and the
Kepler photometry. Astrophys. J. 759, L36 (2012).

17. Mandel, K. & Agol, E. Analytic light curves for planetary transit searches.
Astrophys. J. 580, L171-L175 (2002).

18. Sing, D. K. Stellar limb-darkening coefficients for CoRot and Kepler. Astron.
Astrophys. 510, A21 (2010).

19. de Wit, J. & Seager, S. Constraining exoplanet mass from transmission
spectroscopy. Science 342, 1473-1477 (2013).

20. Howe, A. R., Burrows, A. & Verne, W. Mass-radius relations and core-envelope
decompositions of super-Earths and sub-Neptunes. Astrophys. J. 787, 173
(2014).

21. Bean, J. L., Miller-Ricci Kempton, E. & Homeier, D. A ground-based
transmission spectrum of the super-Earth exoplanet GJ 1214b. Nature 468,
669-672 (2010).

22. Berta, Z. K. et al. The flat transmission spectrum of the super-Earth GJ1214b
from wide field camera 3 on the Hubble Space Telescope. Astrophys. J. 747,
35 (2012).




23. Leconte, J., Forget, F. & Lammer, H. On the (anticipated) diversity of terrestrial 
planet atmospheres. Exp. Astron. 40, 449-467 (2015). 

24. Tellmann, S., Pätzold, M., Häusler, B., Bird, M. K. & Tyler, G. L. Structure of the
Venus neutral atmosphere as observed by the Radio Science experiment VeRa
on Venus Express. J. Geophys. Res. Planets 114, E00B36 (2009).

25. Wilquet, V. et al. Preliminary characterization of the upper haze by SPICAV/ 
SOIR solar occultation in UV to mid-IR onboard Venus Express. J. Geophys. Res. 
Planets 114, E00B42 (2009).

26. Ehrenreich, D. et al. Transmission spectrum of Venus as a transiting exoplanet. 
Astron. Astrophys. 537, L2 (2012). 


Acknowledgements This work is based on observations made with the 
NASA/ESA Hubble Space Telescope that were obtained at the Space Telescope 
Science Institute, which is operated by the Association of Universities for 
Research in Astronomy, Inc. These observations are associated with program 
HST-GO-14500 (principal investigator J.d.W.), support for which was provided 
by NASA through a grant from the Space Telescope Science Institute. The 
research leading to our results was funded in part by the European Research 
Council (ERC) under the FP/2007-2013 ERC grant 336480, and through an 
Action de Recherche Concertée (ARC) grant financed by the Wallonia-Brussels 
Federation. H.R.W. acknowledges support through an appointment to the NASA 
Postdoctoral Program at Goddard Space Flight Center, administered by the 
Universities Space Research Association through a contract with NASA. M.G. is 
Research Associate at the Belgian Fonds (National) de la Recherche Scientifique 
(FRS-FNRS). L.D. acknowledges support of the Fund for Research Training in 
Industry and Agriculture of the FRS-FNRS. We thank D. Taylor, S. Deustua,
P. McCullough, and N. Reid for their assistance in planning and executing our
observations. We are also grateful for discussions with Z. Berta-Thompson
and Pierre Magain about this study and manuscript. We thank the ATLAS and
PHOENIX teams for providing stellar models.


Author Contributions J.d.W. and H.R.W. led the data reduction and
analysis, with the support of M.G., N.K.L. and B.-O.D. J.d.W., H.R.W.,
and N.K.L. led the data interpretation, with the support of M.G. and J.A.V.
J.A.V. provided the limb-darkening coefficients and further insights into 
TRAPPIST-1’s properties and emission together with A.J.B. Every author 
contributed to writing both the manuscript and the HST proposal behind 
these observations. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial 
interests. Readers are welcome to comment on the online version of the 
paper. Correspondence and requests for materials should be addressed to 
J.d.W. (jdewit@mit.edu). 


Reviewer Information Nature thanks D. Ehrenreich and the other anonymous 
reviewer(s) for their contribution to the peer review of this work. 


METHODS 

HST WFC3 observations. We observed the transit of TRAPPIST-1c followed
12 minutes later by the transit of TRAPPIST-1b on 4 May 2016. Observations were
conducted using the HST/WFC3 infrared G141 grism (1.1-1.7 μm) in round-trip
scanning mode (ref. 10). Using the round-trip scanning mode involves exposing the
telescope during an initial forward slew in the cross-dispersion direction, and 
exposing during an equivalent slew in the reverse direction (details on the trade- 
offs behind round-trip scanning are below). Scans were conducted at a rate of 
~0.236 pixels per second, with a final spatial scan covering ~26.4 pixels in the 
cross-dispersion direction on the detector. 

We use the IMA output files from the CalWF3 pipeline, which have been 
calibrated using flat fields and bias subtraction. We applied two different extraction 
techniques which lead to the same conclusions. The first technique extracts the flux 
for TRAPPIST-1 from each exposure by taking the difference between successive 
non-destructive reads. A top-hat filter’” is then applied around the target spectrum, 
measured ±18 pixels from the centre of the TRAPPIST-1 scan, and sets all external
pixels to zero. Next, the images are reconstructed by adding the individual reads 
per exposure back together. Using the reconstructed images, we extracted the 
spectra with an aperture of 31 pixels around the computed centring profile for 
both forward and reverse scan observations. The centring profile is calculated on 
the basis of the pixel flux boundaries of each exposure, which was found to be fully 
consistent across the spectrum for both scan directions. 

The second technique uses the final science image for each exposure and 
determines for each frame the centroid of the spectrum in a box 28 pixels by 
136 pixels, which corresponds to the dimensions of the irradiated region of WFC3’s 
detector for our present observations. It then extracts the flux for 120 apertures 
of sizes ranging along the dispersion direction from 24 pixels to 38 pixels (with 
1-pixel increments), and along the cross-dispersion direction from 120 pixels to 
176 pixels (with 8-pixel increments)—we found the SDNR to be mostly insensitive 
to the aperture size along the dispersion direction. The best aperture was selected 
via minimization of the SDNR of the white-light-curve best fit, which is minimum 
for an aperture of 32 pixels by 157 pixels. 

Both techniques subtract the background for each frame by selecting a region 
well away from the target spectrum, calculating the median flux, and cleaning 
cosmic-ray detections with a customized procedure”*. Our observations present 
three cosmic-ray detections that were not flagged by the CalWF3 pipeline. The 
exposure times were converted from Julian date in universal time (JD_UT) to the
barycentric Julian date in the barycentric dynamical time (BJD_TDB) system’. Both
extraction methods result in the same relative flux measurements from the star 
and SDNR (~240 p.p.m. in the white-light curve), as the build-up of flux over 
successive reads is stable. 

We elected to obtain our observations using the round-trip scan mode in order 
to increase the integration efficiency compared with the standard forward scan 
mode. We note that, owing to slight differences in scan length/position and to 
the way in which the detector is read out (that is, if the direction of the scan is 
in the same direction as the column readout, then the integration time will be 
marginally longer than if the reverse were true’), round-trip scan mode results in 
measurable differences in the total flux of the forward scan exposures compared 
with the reverse scan exposures. This effect has been seen for previous WFC3 
observations!**° in round-trip mode, and has been corrected for in two main ways. 

The first method involves splitting the data into two sets, one for forward scan 
exposures and one for reverse scan exposures, effectively halving the number of 
exposures per light curve, but doubling the number of light curves obtained. Each 
of these data sets is then analysed separately and the results combined at the end". 
The second method uses the median of each scan direction to normalize the two 
light curves, which are then recombined and normalized before the light-curve 
analysis to obtain the transit parameters*’. In the TRAPPIST-1 data, we measure 
a ~0.1% difference in flux level between the two scans. Because of the limited 
phase coverage of the combined transits, to retain the most information about 
the combined and separate effects of each planet (the transit of TRAPPIST-1c 
followed by that of TRAPPIST-1b), we cannot apply the first method. However, 
by applying the second method we found significant remaining structure in the 
residuals, suggesting that the correction is only partial. Previous observations using 
the round-trip scan*’ show that the offset between the light curves obtained with 
each scan varies significantly from orbit to orbit, suggesting that correcting via a 
median combine across visits is not optimal. In addition, the total flux is affected 
asymmetrically by other instrumental systematics—for example, the detector ramp 
consistently yields a first measurement in the forward direction that is significantly 
lower than average—thus biasing the median combine. Therefore, we corrected 
for the flux offset induced by the round-trip scan mode on the basis of the offset in 
the residuals for each HST orbit individually. To do so, we estimate in our forward
model the ‘intermediate residuals’, based on the data corrected for the transit model
and the instrumental systematics. For each orbit, we estimate the mean of these
residuals for each scan direction (m_f and m_r, for the mean of the residuals of the
forward-scan exposures and the reverse-scan exposures, respectively). The ratio of
the fluxes measured in reverse-scan exposures to the shared baseline level is 1 + m_r;
the ratio is 1 + m_f for forward-scan exposures. We therefore correct for their offsets
by dividing each set of exposures by its respective ratio.
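
The per-orbit offset correction described above can be sketched as follows. This is a minimal illustration under assumptions, not the authors' code: the function name `correct_scan_offsets` and its inputs are hypothetical, and the intermediate residuals are assumed to be fractional (flux divided by the transit-plus-systematics model, minus one).

```python
from statistics import mean


def correct_scan_offsets(flux, model, orbit, direction):
    """Divide out forward/reverse scan offsets, one HST orbit at a time.

    flux      : normalized measured fluxes, one per exposure
    model     : transit x systematics model at each exposure (assumed
                multiplicative, so residual = flux / model - 1)
    orbit     : orbit identifier for each exposure
    direction : 'f' (forward scan) or 'r' (reverse scan) per exposure

    Returns the corrected fluxes and the mean residual m for every
    (orbit, direction) pair.
    """
    residuals = [f / m - 1.0 for f, m in zip(flux, model)]

    # Collect the intermediate residuals per orbit and scan direction.
    groups = {}
    for res, orb, scan in zip(residuals, orbit, direction):
        groups.setdefault((orb, scan), []).append(res)
    offsets = {key: mean(values) for key, values in groups.items()}

    # Each set of exposures is divided by its own ratio 1 + m.
    corrected = [
        f / (1.0 + offsets[(orb, scan)])
        for f, orb, scan in zip(flux, orbit, direction)
    ]
    return corrected, offsets


# Toy example: a flat model with a +/-0.05% forward/reverse offset.
flux = [1.0005, 0.9995, 1.0005, 0.9995]
model = [1.0, 1.0, 1.0, 1.0]
orbit = [1, 1, 2, 2]
direction = ["f", "r", "f", "r"]
corrected, offsets = correct_scan_offsets(flux, model, orbit, direction)
```

In a full analysis this step would sit inside the forward model, so that the offsets are re-estimated each time the transit and systematics parameters change.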

HST WFC3 white-light curve and spectroscopy. We first analysed the white- 
light curve by summing the flux across all wavelengths. We fitted the transits 
of TRAPPIST-1b and TRAPPIST-1c by using the transit model of ref. 17, while
correcting for instrumental systematics. We followed the standard procedure for
analysing HST/WFC3 data by fixing the planets’ orbital configurations (all but
the orbital inclinations, which are currently poorly constrained for TRAPPIST-1’s
planets) to the ones reported in the discovery report (ref. 3), while determining the
transit times and depths. We used priors on the band-integrated limb-darkening
coefficients (LDCs) derived from the PHOENIX model intensity spectra (ref. 15), and
on the planets’ orbital inclinations—these parameters being potentially correlated 
with the transit depth estimates—to adequately account for our present state of 
knowledge on TRAPPIST-1. We used different analysis methods to confirm the 
robustness of our conclusions. 

The first method uses a least-squares minimization fitting (L-M) implementation (ref. 12)
to investigate a large sample of systematic models—which include corrections in 
time, HST orbital phase, and positional shifts in wavelength on the detector— 
and marginalize over all possible combinations to obtain the transit parameters. 
The L-M implementation fits the light curves for each systematic model and 
approximates the evidence-based weight of each systematic model using the 
Akaike information criterion*’. It does so while keeping the LDCs fixed to the 
best estimates presented below, and the orbital inclinations fixed to the estimates 
from ref. 3. The highest weighted systematic models include linear corrections 
in time, as well as linear corrections in HST orbital phase or in the shift in wave- 
length position over the course of the visit. Therefore, using marginalization across 
a grid of stochastic models allows us to account for all tested combinations of 
systematics and to obtain robust transit depths for both planets, separately and in 
combination. For this data set, the evidence-based weight approximated for each of 
the systematic models applied to the data indicates that all of the systematic models 
fit equally well to the data, and that no one systematic model contributes to the 
majority of the corrections required to obtain the precision presented (Extended 
Data Fig. 1). In other words, instrumental systematics affect our observations only 
marginally. We carried out independent analyses of the data by using adaptive 
Markov chain Monte Carlo (MCMC) implementations*”*?. For each HST light 
curve, the transit models (ref. 17) of TRAPPIST-1b and TRAPPIST-1c are multiplied
by baseline models that account for the visit-long trend observed in WFC3 light 
curves, WFC3’s ramp, and the ‘HST breathing’ effect!. For these analyses, priors 
are used for the LDCs and the orbital inclinations. We find that the visit-long 
trend is adequately accounted for with a linear function of time, the ramp with a 
single exponential in time, and the breathing with a second-order polynomial in 
HST’s orbital phase. More-complex baseline models were tested and gave consistent 
results, as revealed by the marginalization study. 
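
The marginalization over systematic models described above can be illustrated with Akaike weights. The sketch below is an assumption-laden illustration rather than the implementation of ref. 12: it takes each model's evidence-based weight to be proportional to exp(−AIC/2), and combines the per-model transit depths with a weighted mean whose variance includes both the per-model uncertainty and the scatter between models. The function name `marginalize` and the sample numbers are hypothetical.

```python
import math


def aic(chi2, n_free):
    """Akaike information criterion for a least-squares fit."""
    return chi2 + 2 * n_free


def marginalize(values, errors, aics):
    """Combine per-model estimates using normalized Akaike weights.

    values, errors : best-fit parameter and 1-sigma error from each
                     systematic model
    aics           : AIC of each systematic model

    Returns (marginalized value, marginalized 1-sigma error, weights).
    """
    best = min(aics)
    # Subtracting the best AIC avoids underflow and leaves weights unchanged.
    raw = [math.exp(-0.5 * (a - best)) for a in aics]
    total = sum(raw)
    weights = [r / total for r in raw]

    value = sum(w * v for w, v in zip(weights, values))
    variance = sum(
        w * ((v - value) ** 2 + e**2)
        for w, v, e in zip(weights, values, errors)
    )
    return value, math.sqrt(variance), weights


# Hypothetical transit depths (p.p.m.) from two equally favoured models.
depth, depth_err, weights = marginalize(
    values=[100.0, 102.0], errors=[3.0, 3.0], aics=[50.0, 50.0]
)
```

When all models fit about equally well, as reported for this data set, the weights are nearly uniform and the marginalized error is only slightly larger than the individual errors.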

We calculated the transmission spectrum by fitting the transit depth of
TRAPPIST-1b and TRAPPIST-1c simultaneously in each spectroscopic light curve.
We divided the spectral range between 1.15 μm and 1.7 μm into 11 equal bins of
Δλ = 0.05 μm. We again applied the two techniques described above to analyse each
spectroscopic light curve, resulting in the combined and independent transmission
spectra of TRAPPIST-1b and TRAPPIST-1c. An L-M implementation (ref. 12) and the
adaptive MCMC implementations produced consistent results for each stage of
the analysis.
Limb-darkening coefficients. We determined limb-darkening coefficients by 
fitting theoretical specific intensity spectra (I) downloaded from the Göttingen
spectral library (http://phoenix.astro.physik.uni-goettingen.de/?page_id=73),
which is described in ref. 15. The intensity spectra are provided on a wavelength
grid with 1-Å cadence for 78 μ values, where μ is the cosine of the angle between
an outward radial vector and the direction towards the observer at a point on the
stellar surface. We integrated I over one broad and 11 narrow wavelength intervals,
used in our analysis of the transit light curve. We divided I for each wavelength
interval by I_c, the value of I at the centre of the stellar disc (where μ = 1).

Because the PHOENIX code calculates specific intensity spectra in spherical
geometry, the PHOENIX μ grid extends above the stellar limb relevant to exoplanet
transit calculations. When fitting limb-darkening functions, PHOENIX μ values
should be scaled to yield μ′ = 0 at the stellar radius**. We define μ′ = (μ − μ_0)/(1 − μ_0),
where I/I_c = 0.01 at μ = μ_0. The value of μ_0 is a function of wavelength. We then
fitted two commonly used functional forms for limb darkening (ref. 18):

$$ I/I_c = 1 - a(1 - \mu') - b(1 - \mu')^2 $$

and

$$ I/I_c = 1 - a_1(1 - \mu'^{1/2}) - a_2(1 - \mu') - a_3(1 - \mu'^{3/2}) - a_4(1 - \mu'^{2}) $$

When fitting, we ignored points with μ′ < 0.05.
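
The μ rescaling and the two limb-darkening laws can be written compactly. The following sketch assumes the standard quadratic and four-parameter (non-linear) forms; the coefficient names `a`, `b` and `a1` to `a4`, and the helper `usable_points`, are labels chosen here for illustration rather than the authors' notation.

```python
def rescale_mu(mu, mu0):
    """Rescale mu so that mu' = 0 at the stellar radius (mu = mu0)."""
    return (mu - mu0) / (1.0 - mu0)


def quadratic_law(mu_p, a, b):
    """Quadratic law: I/I_c = 1 - a(1 - mu') - b(1 - mu')^2."""
    return 1.0 - a * (1.0 - mu_p) - b * (1.0 - mu_p) ** 2


def four_parameter_law(mu_p, a1, a2, a3, a4):
    """Four-parameter law: I/I_c = 1 - sum_k a_k (1 - mu'^(k/2)), k = 1..4."""
    return 1.0 - sum(
        a_k * (1.0 - mu_p ** (k / 2.0))
        for k, a_k in enumerate((a1, a2, a3, a4), start=1)
    )


def usable_points(mu_values, intensities, mu0, mu_min=0.05):
    """Keep only (mu', I/I_c) pairs with mu' >= mu_min, as done when fitting."""
    pairs = []
    for mu, intensity in zip(mu_values, intensities):
        mu_p = rescale_mu(mu, mu0)
        if mu_p >= mu_min:
            pairs.append((mu_p, intensity))
    return pairs
```

Both laws return 1 at disc centre (μ′ = 1) by construction, which is a convenient sanity check before passing them to a fitting routine.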

Extended Data Fig. 2 shows the limb-darkening fits for the 12 wavelength
intervals in our transit light curve analysis. We calculated fits for four stellar models
with effective temperatures of 2,500 K and 2,600 K and logarithmic surface gravities 
of 5.0 and 5.5. We then linearly interpolated the limb-darkening coefficients to an 
effective temperature of 2,550 K and gravity 5.22, appropriate for TRAPPIST-1 
(ref. 3). 
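
The interpolation step above, from a 2 × 2 grid of stellar models to the adopted stellar parameters, can be sketched as a bilinear interpolation. This is an assumption about how "linearly interpolated" is realized in two dimensions, and the coefficient values in the example grid are placeholders, not values from the paper.

```python
def bilinear(teff, logg, grid):
    """Bilinearly interpolate a coefficient on a 2 x 2 (Teff, log g) grid.

    grid maps (teff, logg) corner tuples to the coefficient fitted for
    that stellar model.
    """
    t0, t1 = sorted({key[0] for key in grid})
    g0, g1 = sorted({key[1] for key in grid})
    wt = (teff - t0) / (t1 - t0)   # fractional position in temperature
    wg = (logg - g0) / (g1 - g0)   # fractional position in gravity
    return (
        (1 - wt) * (1 - wg) * grid[(t0, g0)]
        + wt * (1 - wg) * grid[(t1, g0)]
        + (1 - wt) * wg * grid[(t0, g1)]
        + wt * wg * grid[(t1, g1)]
    )


# Placeholder coefficients for the four stellar models (hypothetical values).
example_grid = {
    (2500.0, 5.0): 0.40,
    (2600.0, 5.0): 0.38,
    (2500.0, 5.5): 0.44,
    (2600.0, 5.5): 0.41,
}
coefficient = bilinear(2550.0, 5.22, example_grid)
```

At 2,550 K and log g = 5.22 the temperature weight is 0.5 and the gravity weight is 0.44, so each corner contributes between 22% and 28% of the interpolated value.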
Transmission spectrum models. We simulated the theoretical spectra for 
TRAPPIST-1b and TRAPPIST-1c using the model introduced in ref. 19. We used
atmospheric temperatures equal to the planets’ equilibrium temperatures assuming
a Bond albedo of 0.3 (these temperatures being 366 K for TRAPPIST-1b and 315 K
for TRAPPIST-1c). The use of isothermal temperature profiles set at the equilib- 
rium temperatures is conservative, as it does not account for possible additional 
heat sources or temperature inversion and results in a possible underevaluation of 
the atmospheric scale height. Our assumption regarding the temperature profiles 
does not affect our conclusion; variations of 50 K (that is, ~15%) in the atmos- 
pheric temperature modify the amplitude of the transmission spectra by up to 
~15%, because to first order their amplitudes scale with the temperature. As the 
planetary masses are unconstrained, we conservatively use masses of 0.95 M⊕ and 
0.85 M⊕ for TRAPPIST-1b and TRAPPIST-1c, respectively: the maximum masses 
that would allow them to possess hydrogen/helium envelopes greater than 0.1% 
of their total masses given their radii. We use the atmospheric compositions of 
the ‘mini-Neptune’ and ‘Halley world’ models introduced in ref. 35 to simulate 
the hydrogen-dominated and water-dominated atmospheres, respectively. We 
simulated the effect of optically thick cloud or haze at a given pressure level by 
setting to zero the transmittance of atmospheric layers with a higher pressure. 
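
The first-order temperature scaling quoted above can be checked with a short calculation. This is only an illustrative sketch: the function name is hypothetical, and the linear scaling of amplitude with temperature is the approximation stated in the text, not a full radiative-transfer result.

```python
def fractional_amplitude_change(delta_t_kelvin, temperature_kelvin):
    """First-order estimate: spectral amplitude scales linearly with temperature."""
    return delta_t_kelvin / temperature_kelvin

# Equilibrium temperatures quoted in the text (Bond albedo of 0.3).
equilibrium_temperatures = {"TRAPPIST-1b": 366.0, "TRAPPIST-1c": 315.0}

changes = {
    name: fractional_amplitude_change(50.0, t_eq)
    for name, t_eq in equilibrium_temperatures.items()
}
for name, change in changes.items():
    print(f"{name}: {100 * change:.1f}%")
```

A 50 K variation corresponds to roughly 14% for TRAPPIST-1b and 16% for TRAPPIST-1c, consistent with the ~15% stated above.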

The feature at 1.4 μm arises from water absorption; the feature at 1.15 μm for 
the water-dominated atmosphere arises from methane absorption. We compared 
the transmission spectra, allowing for a vertical offset to account for our a priori 
ignorance of the optically thick radius by setting the mean of each spectrum to zero. 
The significance of the deviation of each transmission spectrum from the WFC3 
measurements is shown in Fig. 3. Significance levels less than 3σ mean that the 
data are consistent with that model within the reported errors. 
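
The offset-free comparison described above can be sketched as below. This is an assumed, simplified form: the function name is hypothetical, an unweighted mean is subtracted from both spectra, and a plain chi-square is used as the deviation statistic; the text does not specify how the quoted significance levels were derived from the deviations.

```python
import numpy as np

def offset_free_chi2(data, errors, model):
    """Chi-square between data and model after setting the mean of each spectrum to zero."""
    data = np.asarray(data, dtype=float)
    model = np.asarray(model, dtype=float)
    errors = np.asarray(errors, dtype=float)
    residual = (data - data.mean()) - (model - model.mean())
    return float(np.sum((residual / errors) ** 2))

# Synthetic illustration: a model with a spectral feature versus a featureless spectrum.
model_with_feature = np.array([0.0, 0.0, 1.0, 1.0, 0.0, 0.0])
shifted_copy = model_with_feature + 5.0          # same shape, different vertical offset
flat_spectrum = np.zeros_like(model_with_feature)
unit_errors = np.ones_like(model_with_feature)

chi2_shifted = offset_free_chi2(shifted_copy, unit_errors, model_with_feature)
chi2_flat = offset_free_chi2(flat_spectrum, unit_errors, model_with_feature)
```

Because both means are removed, a pure vertical shift gives a chi-square of zero, whereas a difference in spectral shape does not.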




We rule out the presence of a cloud-free hydrogen-dominated atmosphere for 
either planet at the 10σ level through the combined transmission spectrum (and at a 
lesser 7σ level through their individual spectra). The measurements are consistent with 
volatile-rich (for example, water-rich) atmospheres or hydrogen-dominated atmospheres 
with optically thick clouds or hazes located at pressures greater than 10 mbar. 
Code availability. Conversion of the UTC times for the photometric measurements 
to the BJD_TDB system was carried out using the online program created by 
J. Eastman and distributed at http://astroutils.astronomy.ohio-state.edu/time/ 
utc2bjd.html. We have opted not to make available the codes used for data 
extraction as they are an important part of the researchers’ toolkits. For the same 
reason, we have opted not to make available all but one of the codes used for 
data analysis. The MCMC software used by M.G. to analyse independently the 
photometric data is a custom Fortran 90 code that can be obtained upon request. 
The custom IDL code used to determine limb-darkening coefficients can be 
obtained upon request. 
systematic model applied to the white-light curve. b, Combined uncertainties. The scale of the values here indicates that all of the 
Extended Data Figure 2 | TRAPPIST-1’s limb darkening. Stellar limb-darkening 
relationships for TRAPPIST-1 (black curves) and four stellar 
models (coloured curves) that bracket the effective temperature and 
surface gravity of TRAPPIST-1 (shown in coloured and black numbers 
in a; temperature is in K and surface gravity is expressed as log(g)). The 
circles are theoretical specific intensities ($I$) relative to disc centre ($I_c$) as 
a function of $\mu'$ (the cosine of the angle between an outward radial vector 
and the direction towards the observer). We fitted $I/I_c$ averaged over the 
indicated wavelength intervals to determine the quadratic (dashed curves) 
and four-parameter (solid curves) limb-darkening coefficients. a, Stellar 
limb-darkening relationship integrated over WFC3’s spectral band. 
b–l, Stellar limb-darkening relationship over the 11 spectral channels 
used here. 
Comets are thought to preserve almost pristine dust particles, 
thus providing a unique sample of the properties of the early solar 
nebula. The microscopic properties of this dust played a key part 
in particle aggregation during the formation of the Solar System. 
Cometary dust was previously considered to comprise irregular, fluffy 
agglomerates on the basis of interpretations of remote observations 
in the visible and infrared and the study of chondritic porous 
interplanetary dust particles that were thought, but not proved, to 
originate in comets. Although the dust returned by an earlier mission 
has provided detailed mineralogy of particles from comet 81P/Wild, 
the fine-grained aggregate component was strongly modified during 
collection. Here we report in situ measurements of dust particles at 
comet 67P/Churyumov-Gerasimenko. The particles are aggregates of 
smaller, elongated grains, with structures at distinct sizes indicating 
hierarchical aggregation. Topographic images of selected dust 
particles with sizes of one micrometre to a few tens of micrometres 
show a variety of morphologies, including compact single grains 
and large porous aggregate particles, similar to chondritic porous 
interplanetary dust particles. The measured grain elongations are 
similar to the value inferred for interstellar dust and support the idea 
that such grains could represent a fraction of the building blocks of 
comets. In the subsequent growth phase, hierarchical agglomeration 
could be a dominant process and would produce aggregates that stick 
more easily at higher masses and velocities than homogeneous dust 
particles. The presence of hierarchical dust aggregates in the near-surface 
of the nucleus of comet 67P also provides a mechanism for 
lowering the tensile strength of the dust layer and aiding dust release. 

MIDAS, the Micro-Imaging Dust Analysis System, is the first 
space-borne atomic force microscope (AFM) and a unique instrument 
designed to measure the size, shape, texture and microstructure of cometary 
dust. Flying on the Rosetta spacecraft, it collects dust on sticky targets 
during passive exposures and images its three-dimensional topography 
with an unprecedented nanometre to micrometre resolution. 

Cometary dust was first collected in mid-November 2014. Here, we 
focus on particles collected from then until the end of February 2015. 
The collected particles cover a range of sizes from tens of micrometres 
down to a few hundred nanometres, and have various morphologies, 
from single grains to aggregate particles with different packing 
densities. Five examples are presented here. 

Figure 1 shows topographic images (height fields) of three particles 
(A, B and C). We refer to particles A and C as compact, because their 
sub-units (hereafter grains) are tightly packed; particle B appears to be 
a homogeneous grain. The next example (D) is also a compact particle, 
scanned with a higher lateral resolution of 80 nm (Fig. 2), a factor of four 
better than the previous scan. The final particle (E), presented in Fig. 3, 
is best described as a loosely packed, ‘fluffy’ aggregate comprising many 
grains. Detailed collection times and geometries for all particles can be 
found in Extended Data Figs 1-3. 

Aided by the three-dimensional nature of the data, individual grains 
can be identified, as shown in Figs 1b, 2b and 3b. The properties of these 
particles and their grains are summarized in Table 1 for particles A-D 
and in Fig. 3d for particle E. Because particle E extends beyond the edge 
of the scanned area, only lower limits for its dimensions can be given. 
All further calculations and discussion refer to only this visible region. 
Figure 1 | AFM topographic images of particles A, B and C and their 
sub-units. a, A 20 μm × 50 μm overview image with a pixel resolution 
of 312 nm and the colour scale representing the height, z. b, As in a, but 
with particle B and the sub-units of particles A and C outlined in cyan. 
c, d, 10 μm × 10 μm three-dimensional (rotated) images of particles C and 
A with two-times height exaggeration to aid visualization. 
Compact particles A and C are both approximately 5.6 μm in effective 
diameter (hereafter size; see Methods) and are built from grains in the 
size range from 1.93 μm to 3.31 μm (Table 1; the errors are given as 
the linear addition of the 1σ statistical uncertainty and the systematic 
uncertainty; see Methods). The compact grain B is 2.76 μm in size, 
comparable to the dust grains of particles A and C. In fact, the topographic 
image suggests that grain B was originally part of particle C, but detached 
on impact with the target. Particle D is 1.09 μm in size, again similar 
to the grains in A-C. However, the higher resolution reveals that this 
micrometre-sized particle is itself an aggregate of smaller units; seven 
grains can be resolved, with sizes ranging from 260 nm to 540 nm. 
The visible part of particle E has a maximum extent of 14 μm in the 
x direction and 37 μm in the y direction. Analysis of its component grains 
(Fig. 3d) shows sizes in the range from 0.58 μm to 2.57 μm, with 
the grain heights ranging between 0.2 μm and 3 μm and with 90% smaller 
than 1.7 μm. These measurements are evidence for a continuation of the 
aggregate nature of dust particles below the size range observed by the 
COSIMA (Cometary Secondary Ion Mass Analyser) instrument 
on-board Rosetta (tens to hundreds of micrometres). 

Particle E also shows a morphology that is strongly reminiscent of strat- 
ospheric, chondritic porous interplanetary dust particles (IDPs), which 
have long been suspected of having a cometary origin. This link is con- 
sistent with observations by COSIMA for larger dust particles, which also 
measured similar compositions for dust at comet 67P and IDPs. One 
notable difference to IDPs is the extremely flat nature of particle E, which 
has a height that is an order of magnitude lower than its (minimal) lateral 
dimension. Indeed, all of the particles presented here have flattened shapes 
to some degree (see Table 1). It is not yet clear if this is an intrinsic property 
of cometary dust or the result of a rearrangement of grains on impact. 
COSIMA has observed that sub-millimetre aggregate particles undergo 
rearrangement of their grains on impact, producing flattened shapes. 
Additionally, COSIMA collected small, apparently compact particles that 
are also flat, but the resolution is insufficient to determine if they are sin- 
gle grains or aggregates. On the other hand, cluster-cluster aggregation 
with rotating grains can form elongated structures with very high aspect 
ratios, and laboratory experiments have produced dust “flakes”. 

Investigation of the size distribution of chondritic porous IDPs and 
fine-grained material returned by the Stardust mission showed 
that the majority of their component grains are smaller than 500 nm 
(refs 20, 21). Figure 3d shows that 90% of the grains in particle E are 
smaller than 2 μm, comparable to the size of particle D, which is itself 
Figure 2 | AFM topographic images of particle D and its sub-units. 
a, A 5 μm × 5 μm overview image with a pixel resolution of 80 nm and the 
colour scale representing the height, z. b, As in a, but with the sub-units 
of particle D outlined in cyan. c, A three-dimensional (rotated) image of 
the particle with two-times height exaggeration to aid visualization. 
built from grains smaller than about 500 nm. This result suggests that 
the grains of the fluffy aggregate particle E are also aggregates of sub-micrometre 
components similar to those in chondritic porous IDPs, 
and points towards a hierarchical structure. Hierarchical growth (that 
is, aggregates of smaller aggregates) has been proposed as a growth 
mechanism in the protoplanetary disk when fragmentation of larger 
particles provides a population of smaller aggregates available for 
agglomeration. The sticking probability of such particles can be higher 
than that of homogeneous dust for a given mass and velocity and needs 
to be accounted for in models of dust particle growth. Hierarchical 
aggregates have also been invoked to produce a surface layer of cometary 
dust with sufficiently low tensile strength to allow for dust release. 
Figure 3 | AFM topographic images of particle E, showing its sub-units 
and their size distribution. a, A 14 μm × 37 μm overview image with a pixel 
resolution of 210 nm and the colour scale representing the height, z. 
b, As in a, but with identified grains outlined in cyan. c, A three-dimensional 
14 μm × 34 μm view (corresponding to region indicated by the red dashed 
box in a; rotated and cropped). d, Cumulative distribution of the equivalent 
diameters of the grains (red circles), with error bars in grey (where the errors 
are given as the linear addition of the 1σ statistical uncertainty and the 
systematic uncertainty; see Methods). The left scale shows the cumulative 
number of grains and the right scale shows the probability that particles have 
equivalent diameters below the specific values. 
Table 1 | Size, height and elongation of dust particles A-D and their 
component dust grains 

              Type               d (μm)   z_max (μm)   Elongation 
Particle A    Compact particle   5.48     1.79         3.32 
  Grain 1     Dust grain         3.31     1.79         2.94 
  Grain 2     Dust grain         2.62     1.33         3.04 
  Grain 3     Dust grain         1.93     1.57         - 
  Grain 4     Dust grain         2.62     1.55         1.96 
Particle B    Dust grain         2.76     1.02         3.14 
Particle C    Compact particle   5.79     1.39         4.77 
  Grain 1     Dust grain         2.66     1.33         2.26 
  Grain 2     Dust grain         2.57     1.14         2.80 
  Grain 3     Dust grain         2.18     1.28         2.23 
  Grain 4     Dust grain         2.42     1.39         2.31 
  Grain 5     Dust grain         2.31     1.38         2.32 
Particle D    Compact particle   1.09     0.42         3.36 
  Grain 1     Dust grain         0.26     0.17         1.89 
  Grain 2     Dust grain         0.48     0.22         2.52 
  Grain 3     Dust grain         0.41     0.31         1.62 
  Grain 4     Dust grain         0.33     0.25         1.74 
  Grain 5     Dust grain         0.46     0.37         1.53 
  Grain 6     Dust grain         0.54     0.42         2.00 
  Grain 7     Dust grain         0.26     0.32         2.00 


d is the diameter of a circle with equivalent area to that of the particle or grain; z_max is the maximum 
height above the substrate surface. The errors of the diameters are given as the linear addition of 
the 1σ statistical uncertainty and the systematic uncertainty, and the errors of the elongation are 
given as the worst-case estimate; see Methods for further details. For particle A grain 3 and particle D 
grains 4, 6 and 7, the maximal elongation is found for the ratio of the two lateral dimensions, which 
carry especially large uncertainties. Therefore, an accurate elongation cannot be given 
for particle A grain 3 and the elongations of the grains of particle D have large uncertainties. 
Because MIDAS provides real measurements of the grain shapes, it is 
possible to evaluate which models support the observations. The elongation 
of the grains is found by calculating the ratio of the longest and 
shortest perpendicular axes. Further details are described in Methods. 
For particle E, the grain heights are almost all smaller than their in-plane 
diameters, suggesting that it comprises a single layer of grains, allowing 
accurate grain heights to be determined. The elongation is calculated for 
114 grains (the 11 omitted grains show strong distortions due to tip convolution), 
giving an average elongation of 2.87 (that is, the largest 
axis is about three times longer than the smallest; the uncertainties represent a 
worst-case estimate containing 1σ statistical errors and systematic uncertainties). 
The compact particles show similar values (Table 1). 
Elongated grains are considered in several models of cometary 
dust. For example, it has been suggested (ref. 22) that comets aggregate from 
interstellar grains. In ref. 22, the dust grains were modelled as cylinders 
with aspect ratios of 2-4, and good agreement was found between light 
scattering experiments and observations. Other works have similarly 
demonstrated good agreement between simulations using aggregates of 
spheroidal particles and observational data. The elongated nature of 
interstellar dust can be inferred from linear polarization of starlight due 
to partially aligned grains. The core-mantle structure proposed for 
interstellar and cometary dust cannot be confirmed by MIDAS data 
alone, but the elongation measurement supports the idea of a common 
precursor grain, or growth mechanism. 
Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS 


Data acquisition and calibration. Exposure durations and times were planned 
by estimating the dust flux using the predicted spacecraft position, pointing and a 
dust flux model for comet 67P derived from observational data. For a graphical 
visualization of the exposure geometries, see Extended Data Figs 1-3. 

MIDAS operates in a slightly different way from most terrestrial AFMs, by 
making a careful approach to the sample at each pixel position and then moving 
away by a so-called retraction distance before moving to the next pixel, resulting 
in long scan times and possible distortion. Distortion correction is performed 
using scans of on-board calibration targets, and polynomial background correction 
is used to remove height drifts. This procedure was performed with the data used to 
produce Figs 1 and 3. The scan shown in Fig. 2 was much shorter and no substantial 
distortion was observed; hence, only background subtraction was performed. 
Particle and grain heights are measured relative to the substrate surface, which is 
very clear for Figs 1 and 2, but the zero reference level had to be set manually for 
each grain in Fig. 3, because the steps would otherwise distort the measurements. 

The lateral extent of particles and grains is characterized by an effective size (d), 
which is the diameter of a circle with the same area as the projection of all pixels 
forming the unit; unless stated otherwise, all references to size refer to this effective 
value. The peak height (z_max) is the maximum elevation above the target for a 
given grain. Identification of particles and their sub-units is performed by visual 
inspection of the calibrated data and, when necessary, cross-sections through the 
three-dimensional data are used; see Extended Data Fig. 4. 

For particle E (Fig. 3), a manual levelling of the surface was necessary, owing to 
the visible steps (imaging artefacts). Repeating this manual levelling process several 
times showed that the induced error was negligible. In addition, the height of a 
grain can be measured precisely only if the grain is directly on the surface and not 
on another grain. For particle E, most of the grains seem to fulfil this requirement, 
because the mean heights of the grains are smaller than their mean diameters. 
Error analysis. In principle, because AFM tips cannot be infinitely sharp, the size 
of every particle is overestimated, owing to the tip-sample convolution (that is, the 
recorded image reflects a combination of tip and sample shapes). For example, 
a cone-shaped tip with a large opening angle artificially broadens features, as 
depicted in Extended Data Fig. 5. The convolution uncertainty is generously 
estimated here to give an upper limit. Because the particle diameter cannot be 
underestimated by this convolution, the uncertainty interval becomes asymmetric. 
This systematic uncertainty is linearly added to the 1σ statistical uncertainty 
generated by the identification of the grains in the scan. Values for sizes and 
respective uncertainties quoted in the text for all particles, depicted in Fig. 3d 
for particle E and presented in Table 1 for particles A-D, reflect this calculation. 
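
The broadening effect of a finite tip can be illustrated with a one-dimensional sketch. This is not the MIDAS processing chain: it models the recorded profile as a grey-scale dilation of the surface by the tip shape, a common idealization of AFM imaging, and assumes that the 30° opening angle refers to the full cone angle. The function names and the unit sphere radius are hypothetical choices for the illustration.

```python
import numpy as np

def sphere_profile(x, radius):
    """Top-surface height of a sphere of the given radius resting on a flat substrate at z = 0."""
    z = np.zeros_like(x)
    inside = np.abs(x) < radius
    z[inside] = radius + np.sqrt(radius**2 - x[inside] ** 2)
    return z

def image_with_cone_tip(x, surface, full_opening_angle_deg):
    """Recorded profile for a cone-shaped tip, computed as a grey-scale dilation."""
    half_angle = np.radians(full_opening_angle_deg) / 2.0
    slope = 1.0 / np.tan(half_angle)  # rise of the tip flank per unit lateral distance
    # lateral_distance[i, j]: distance between tip apex position x[i] and surface point x[j]
    lateral_distance = np.abs(x[:, None] - x[None, :])
    # The apex rests at the lowest height for which the tip flank clears every surface point.
    return np.max(surface[None, :] - slope * lateral_distance, axis=1)

x = np.linspace(-3.0, 3.0, 601)
surface = sphere_profile(x, radius=1.0)
image = image_with_cone_tip(x, surface, full_opening_angle_deg=30.0)

true_width = np.count_nonzero(surface > 0.01)
imaged_width = np.count_nonzero(image > 0.01)
```

In this idealization the imaged footprint is wider than the true one while the peak height is unchanged, which is why the size uncertainty is one-sided.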
The elongation of particles and grains is calculated by determining their 
equivalent ellipse (the ellipse with the same second-order moments) and taking 
the largest of three ratios, each expressed as the larger dimension over the smaller: 
(i) the height of the particle and the major axis, (ii) the height and the minor axis 
and (iii) the major and minor axes. The uncertainties in these ratios take into 
account the 1σ statistical uncertainty 
due to the manual masking of the particles and the systematic uncertainty due to 
the tip-sample convolution for the axis lengths. The ratio of the major to minor axis 
suffers from a large convolution uncertainty that, in some cases (typically particles 
with steep slopes), prevents a clear statement about the orientation. In these cases, 
no elongation is given. The final uncertainty for the ratio is a worst-case estimate 
that overestimates the uncertainty for non-isolated flat grains. 
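
The size and elongation definitions above can be sketched for a binary grain mask as follows. This is an illustrative implementation and not the Gwyddion workflow used in the paper: the function names are hypothetical, the mask is a synthetic ellipse, and the elongation is interpreted here as the largest dimension divided by the smallest among major axis, minor axis and height.

```python
import numpy as np

def equivalent_diameter(mask, pixel_size):
    """Diameter of the circle with the same projected area as the masked pixels."""
    area = np.count_nonzero(mask) * pixel_size**2
    return 2.0 * np.sqrt(area / np.pi)

def equivalent_ellipse_axes(mask, pixel_size):
    """Full major and minor axis lengths of the ellipse with the same second-order moments."""
    ys, xs = np.nonzero(mask)
    coords = np.stack([xs, ys]).astype(float) * pixel_size
    covariance = np.cov(coords, bias=True)
    variances = np.sort(np.linalg.eigvalsh(covariance))[::-1]
    # For a uniformly filled ellipse the variance along an axis is (semi-axis)^2 / 4.
    major, minor = 4.0 * np.sqrt(variances)
    return major, minor

def elongation(major, minor, height):
    """Largest pairwise ratio (larger over smaller) among the three dimensions."""
    dims = (major, minor, height)
    return max(dims) / min(dims)

# Synthetic grain: an ellipse with semi-axes of 40 and 20 pixels.
yy, xx = np.mgrid[-50:51, -50:51]
mask = (xx / 40.0) ** 2 + (yy / 20.0) ** 2 <= 1.0
d_eff = equivalent_diameter(mask, pixel_size=1.0)
major_axis, minor_axis = equivalent_ellipse_axes(mask, pixel_size=1.0)
```

With real data, `pixel_size` would be the pixel resolution of the scan and `height` the measured peak height of the grain.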
Code and data availability. Extended Data Table 1 summarizes the key parameters 
for the AFM scans used to produce Figs 1-3. The filenames listed refer to products 
available in the ESA Planetary Science Archive where all data used here are freely 
available (http://www.cosmos.esa.int/web/psa/rosetta). The open-source package 
Gwyddion was used to perform calibration, grain identification and analysis 
throughout this paper (http://gwyddion.net/). 
Extended Data Figure 1 | The geometry of the exposures where particles A, B and C were collected. All exposures are marked by 
green bars. The top panel shows the distance of Rosetta from the comet (blue) and the off-nadir angle (red). The lower panel shows the latitude 
and longitude of the point on the comet below the spacecraft (the sub-spacecraft latitude and longitude) in blue and red, respectively. The 
heliocentric distance (between the comet and the Sun) during this exposure was 2.25 au. 
Extended Data Figure 2 | The geometry of the exposures where particle D was collected. All exposures are marked by green bars. The top panel shows 
the distance of Rosetta from the comet (blue) and the off-nadir angle (red). The lower panel shows the sub-spacecraft latitude and longitude in blue and 
red, respectively. The heliocentric distance during this exposure varied between 2.54 au and 2.41 au. 
Extended Data Figure 3 | The geometry of the exposures where particle E was collected. All exposures are marked by green bars. The top panel shows 
the distance of Rosetta from the comet (blue) and the off-nadir angle (red). The lower panel shows the sub-spacecraft latitude and longitude in blue and 
red, respectively. The heliocentric distance during this exposure varied between 2.85 au and 2.52 au. 
Extended Data Figure 4 | Topographic cross-sections demonstrating 
the identification of sub-units. a, Topographic image of particles A, B 
and C. Dashed blue, red and green lines show where the cross-sections of 
particle A, B and C, respectively, were made. The colour scale represents 
the height. b, Height profiles of the three cross-sections shown in a, 
demonstrating how sub-grains were identified (blue and green arrows) 
and revealing slopes of 60°–70° with the substrate surface. 
Extended Data Figure 5 | Tip-sample convolution effects. 
a, b, Simulated AFM images (colour scale indicates the height) providing 
a comparison between a spherical particle imaged with an ideal, infinitely 
sharp tip (a) and with a cone-shaped tip with an opening angle of 30° (b), 
which is similar to that of the MIDAS tips. c, d, The corresponding cross-sections 
through the centre of the structures (y axis shows the height). 
The black dashed curves show the spherical particle and the blue lines 
depict the topography as measured with infinitely sharp and cone-shaped 
tips, respectively. The measurement of the volume of the spherical particle 
is exaggerated by 25% for the delta-shaped tip and by 50% for the cone-shaped 
tip. The maximum height measurement is not affected by the tip-sample 
convolution. 
Panel titles: a, delta-shape tip, convoluted height; b, 30° opening angle, convoluted height. 
Extended Data Table 1 | Scan parameters of the primary AFM topography scans shown in Figs 1-3 

                   Figure 1                    Figure 2                    Figure 3 
target             14                          12                          12 
cantilever         9                           [illegible]                 7 
image resolution   256 × 256                   256 × 256                   192 × 192 
image size         80 × 80 μm²                 20 × 20 μm²                 40 × 40 μm² 
pixel resolution   312 nm                      80 nm                       210 nm 
z step size        0.7 nm                      0.7 nm                      0.7 nm 
retraction height  1095 nm                     977 nm                      734 nm 
duration           1 day, 05:05:33             08:14:15                    11:16:30 
start time         2015-04-29T05:21:40Z        2015-03-13T08:44:38Z        2015-01-18T20:59:28Z 
filename           IMG_1509813_1512600_054_ZS  IMG_1507001_1508813_005_ZS  IMG_1501323_1504200_013_ZS 

The number of pixels and the pixel resolution at a given scan size were limited by the time available and chosen to maximize the resolution. The filename corresponds to that used in the Planetary 
Science Archive. 
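
As a consistency check on the table above, the pixel resolution follows from the image size divided by the number of pixels per line. The short sketch below uses only the values listed in the table; the function and variable names are illustrative.

```python
def pixel_resolution_nm(image_size_um, pixels_per_line):
    """Pixel resolution in nanometres for a square scan of the given size."""
    return 1000.0 * image_size_um / pixels_per_line

# (image size in micrometres, pixels per line) as listed in Extended Data Table 1.
scans = {"Figure 1": (80.0, 256), "Figure 2": (20.0, 256), "Figure 3": (40.0, 192)}

resolutions = {name: pixel_resolution_nm(*params) for name, params in scans.items()}
for name, value in resolutions.items():
    print(f"{name}: {value:.1f} nm")
```

The computed values (312.5 nm, 78.1 nm and 208.3 nm) agree with the rounded pixel resolutions of 312 nm, 80 nm and 210 nm quoted in the table.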


© 2016 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 


LETTER 


doi:10.1038/nature18605 


Dynamically encircling an exceptional point 
for asymmetric mode switching 


Jörg Doppler, Alexei A. Mailybaev, Julian Böhm, Ulrich Kuhl, Adrian Girschik, Florian Libisch, Thomas J. Milburn, 
Peter Rabl, Nimrod Moiseyev & Stefan Rotter 


Physical systems with loss or gain have resonant modes that decay or 
grow exponentially with time. Whenever two such modes coalesce 
both in their resonant frequency and their rate of decay or growth, 
an ‘exceptional point’ occurs, giving rise to fascinating phenomena 
that defy our physical intuition. Particularly intriguing behaviour 
is predicted to appear when an exceptional point is encircled 
sufficiently slowly, such as a state-flip or the accumulation of a 
geometric phase. The topological structure of exceptional points 
has been experimentally explored, but a full dynamical encircling 
of such a point and the associated breakdown of adiabaticity have 
remained out of reach of measurement. Here we demonstrate that 
a dynamical encircling of an exceptional point is analogous to the 
scattering through a two-mode waveguide with suitably designed 
boundaries and losses. We present experimental results from a 
corresponding waveguide structure that steers incoming waves 
around an exceptional point during the transmission process. In this 
way, mode transitions are induced that transform this device into a 
robust and asymmetric switch between different waveguide modes. 
This work will enable the exploration of exceptional point physics in 
system control and state transfer schemes at the crossroads between 
fundamental research and practical applications. 

Exceptional points (EPs), also called non-Hermitian degeneracies 
or branch points, have turned out to be at the origin of many counter-intuitive 
phenomena appearing in physical systems that experience 
gain or loss. Such external influences on a system require a non-Hermitian 
description that incorporates non-conservation of energy 
resulting from an external input or output. Rather than being merely a 
perturbative correction, gain and loss can entirely turn the behaviour 
of a system upside down when approaching an EP. Consider here, for 
example, the recent demonstrations of unidirectional invisibility, 
loss-induced suppression and revival of lasing, and single-mode 
lasers with gain and loss or directional output, all of which were 
realized at or close to an EP. These studies already nicely demonstrate 
the potential of EPs for novel effects and devices, but the full capability 
of EPs can be accessed when the EP is not just approached or swept 
across, but dynamically encircled. 

Originally, it was believed that a slow encircling of an EP would result 
in an adiabatic evolution of states and a corresponding state flip, but 
more recent work has rigorously shown that the same non-Hermitian 
components necessary for the observation of an EP actually prevent 
an application of the adiabatic theorem. Instead, non-adiabatic 
transitions lead to a chiral behaviour, in the sense that encircling an 
EP in a clockwise or a counter-clockwise direction results in different 
final states. While this fascinating feature has great potential 
for quantum control and switching protocols, it has so far defied any 
experimental realization. This is because observing the non-adiabatic 
contributions requires a fully dynamical encircling of the EP that goes 
beyond the quasi-static experiments reported so far. A dynamically 
resolved experiment is, however, extremely challenging, because of the 
precise control required of the two exponentially amplified or damped 
resonant modes that meet at the EP, which must also be decoupled from 
all other modes present in a system. 

Proposals to overcome this problem have meanwhile been put 
forward, such as to map the dynamical encircling of an EP to the 
polarization evolution in a stratified non-transparent medium, but 
the implementation requirements involved prevented an experimental 
realization for this case too. Here, we overcome such difficulties by 
demonstrating that waveguides with two transverse modes can be 
suitably engineered such that the transmission through them is 
equivalent to a slow dynamical encircling of an EP. In this way we make 
the recently discussed dynamical features of EPs directly accessible 
through established waveguide technology as used for the transmission 
of sound, light, microwaves and matter waves. 

An EP arises when an open system described by the Schrödinger-type 
equation $i\partial_t \psi = H\psi$ features two resonant modes that coalesce. 
Such a scenario can conveniently be captured by the following 
non-Hermitian 2 × 2 Hamiltonian: 

$$ H = \begin{pmatrix} \delta - i\gamma_1/2 & g \\ g & -i\gamma_2/2 \end{pmatrix} \tag{1} $$

where $g$ denotes the coupling and $\delta$ the detuning; $\gamma_1$ and $\gamma_2$ are the 
respective loss rates of the two relevant modes. At the specific parameter 
configuration $\delta_{\rm EP} = 0$ and $g_{\rm EP} = |\gamma_1 - \gamma_2|/4$, both the eigenvalues 
and eigenvectors of this Hamiltonian coalesce, which is the hallmark 
of the EP. As shown in Fig. 1, the vicinity of this point exhibits a 
characteristic structure of a self-intersecting Riemann surface. The EP 
marks the branch point (at the centre of each panel in Fig. 1a, b) at which 
the Riemann surface splits. It is this topological structure that allows one 
to encircle the EP such that the two eigenmodes interchange: for such a 
state-flip two system parameters need to be continuously changed in 
time $t$ (for example, the coupling $g = g(t)$ and the detuning $\delta = \delta(t)$) along 
a closed loop in parameter space around the EP. This system evolution 
is described by the now time-dependent Hamiltonian (1) in the 
corresponding Schrödinger-type equation $i\partial_t \psi(t) = H(t)\psi(t)$. If the 
system dynamics is fully adiabatic, a flip between the two states is 
realized upon encircling the EP such that the lower state becomes the 
upper one (Fig. 1a, left). As was found only recently, however, contributions 
due to the breakdown of adiabaticity in non-Hermitian systems 
always enter dominantly whenever both encircling directions are 
considered. In the case above, traversing the same parameter loop in the 
opposite direction thus leads to the situation that the lower state returns 
to itself rather than to the upper state (Fig. 1b, left). This enforces 
an overall asymmetric behaviour such that the state that is selected at 
the end of a loop depends only on the loop’s encircling direction, 
but not on its starting point (compare Fig. 1a and Fig. 1b for a 
Figure 1 | Mode evolution in the vicinity of an exceptional point. To
demonstrate the non-adiabatic nature of dynamically encircling an EP
degeneracy, we show trajectories with different encircling directions,
starting on both of the Riemann sheets involved (shown as red and blue
surfaces). The results for the state evolution of the Schrödinger-type
equation $i\partial_x\psi(x) = H(x)\psi(x)$, projected onto their respective Riemann
sheets, are shown as black lines: the larger the contribution of an 
eigenvector, the closer it follows the corresponding eigensheet”!. 

a, Dynamics of two states with starting points on different sheets 


counter-clockwise and clockwise encircling, respectively. On a very 
fundamental level, these features are connected with the Stokes 
phenomenon of asymptotics'®"* as well as with the theory of singular 
perturbations and stability loss delay”!—concepts that can be used to 
determine whether and when the non-adiabatic excursions shown in 
Fig. 1a, b occur, as well as to estimate their latest possible onset time?!. 
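The coalescence condition stated for Hamiltonian (1) can be checked numerically. The following is a minimal sketch, assuming illustrative loss rates that are not taken from the paper; the function names are hypothetical. It evaluates the eigenvalue gap at the predicted EP and at a point with twice the coupling, where the gap should equal $2\sqrt{g^2 - g_{\rm EP}^2}$.

```python
import numpy as np

def hamiltonian(delta, g, gamma1, gamma2):
    """2 x 2 non-Hermitian Hamiltonian of equation (1)."""
    return np.array([[delta - 0.5j * gamma1, g],
                     [g, -0.5j * gamma2]], dtype=complex)

def eigenvalue_gap(delta, g, gamma1, gamma2):
    """Absolute difference between the two complex eigenvalues."""
    lam = np.linalg.eigvals(hamiltonian(delta, g, gamma1, gamma2))
    return abs(lam[0] - lam[1])

# Illustrative loss rates (assumed values, arbitrary units).
gamma1, gamma2 = 1.0, 0.2
g_ep = abs(gamma1 - gamma2) / 4.0      # predicted EP coupling at delta = 0

gap_at_ep = eigenvalue_gap(0.0, g_ep, gamma1, gamma2)
gap_away = eigenvalue_gap(0.0, 2.0 * g_ep, gamma1, gamma2)

# Overlap of the two normalized eigenvectors at the EP (1 means parallel).
_, vecs = np.linalg.eig(hamiltonian(0.0, g_ep, gamma1, gamma2))
overlap_at_ep = abs(np.vdot(vecs[:, 0], vecs[:, 1]))
```

Because the matrix is defective at the EP, the numerical gap is limited by round-off (roughly the square root of machine precision) rather than being exactly zero.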

To observe this behaviour in a realistic environment, we now map 
the Hamiltonian in equation (1) onto the problem of microwave 
transmission through a smoothly deformed metallic waveguide in the 
presence of absorption (see Fig. 1c). The waveguide is extended along 
the x axis and we restrict the following discussion to a single transverse 
dimension y. Within this framework, the parametrical encircling of the 
EP from the 2 x 2 model shown above translates to a slow variation of 
a periodic boundary modulation along the waveguide. Directly at the 
EP, both the Bloch wavenumbers $K$ and the Bloch modes $\Lambda$ of the
electric field distribution $\phi(x, y) = \Lambda(x, y)e^{iKx}$ coalesce. More
specifically, the harmonic solutions $\varphi(x, y, t) = \phi(x, y)e^{-i\omega t}$ for fields
oscillating with frequency $\omega$ obey the Helmholtz equation:


$$\Delta\phi(x, y) + V(x, y)\phi(x, y) = 0 \qquad (2)$$


where $\Delta$ is the Laplace operator in two dimensions, $V(x, y) = \varepsilon(x, y)\omega^2/c^2$
is a complex potential proportional to the dielectric function $\varepsilon$, and $c$
is the speed of light. For a straight rectangular waveguide with a fixed
width $W$ in the $y$ direction the solutions of equation (2) in the absence
of losses are $\phi_n(x, y) = u_n(y)e^{ik_n x}$ with transverse mode functions
$u_n(y) = \sin(n\pi y/W)$ and wavevectors $k_n = \sqrt{\omega^2/c^2 - n^2\pi^2/W^2}$. By
choosing an appropriate input frequency w, the transmission problem 
can naturally be reduced to only two propagating modes n = 1, 2. To 
implement a controlled coupling between these modes, we consider a 
waveguide subject to a boundary modulation $\xi(x) = \sigma\sin(k_b x)$, as
shown in Fig. 1c. By choosing the boundary wavenumber
$k_b = k_1 - k_2 + \delta$, where $|\delta| \ll k_b$, near-resonant scattering between the
otherwise very different modes $\phi_1$ and $\phi_2$ occurs. The full solution for
the propagating field can be written in the form: 


$$\phi(x, y) = a_1(x)\phi_1(x, y) + a_2(x)\phi_2(x, y) \qquad (3)$$


Employing a Floquet-Bloch ansatz, we obtain a Schrödinger-type
equation for the slowly varying modal amplitudes


ab(x) = (e1(x), co(x))? =e (.fiky a(x), «| —iky an(x))!: 


during a counter-clockwise loop around the EP (as seen from the top). 

b, Same as a for a clockwise loop. In both a and b the end points of the 
loops depend only on the encircling direction, not on their starting point. 
c, Schematic of an asymmetric mode-switch that projects the above 
EP-encircling to a waveguide that strongly attenuates one of its two 
transverse modes, depending on the injection direction. The parameter- 
space trajectories describing counter-clockwise and clockwise loops 
around the EP shown in a and b correspond to the left and right injection,
respectively. 


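$$i\partial_x \begin{pmatrix} c_1(x) \\ c_2(x) \end{pmatrix} = \begin{pmatrix} \delta(x) - i\gamma_1/2 & g(x) \\ g(x) & -i\gamma_2/2 \end{pmatrix} \begin{pmatrix} c_1(x) \\ c_2(x) \end{pmatrix} \qquad (4)$$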


(See Supplementary Information for a more detailed derivation verified 
by numerical simulations.) The slow variation of $\delta = \delta(x)$ and
$g = g(x) \propto \sigma(x)$ in Hamiltonian (1) is then directly implemented in the
waveguide through a smooth variation of the modulation potential
$V(x, y)$, which leaves the validity of equations (3) and (4) intact. Finally,
owing to the even and odd symmetry of $u_1(y)$ and $u_2(y)$, an absorbing
material placed close to the centre of the waveguide gives rise to losses
$\gamma_1 \gg \gamma_2$. With the above, all parameters in the non-Hermitian
Hamiltonian H in equation (1) are determined. However, instead of 
governing the temporal dynamics (in time), H determines here the 
mode propagation in the longitudinal direction x. Correspondingly, 
the requirement of encircling the EP slowly (in time $t$) is transferred
here to a slow variation of the boundary parameters along the propa- 
gation direction x (see Fig. 1c). Quite remarkably, a right and left injec- 
tion into the waveguide corresponds to a clockwise and 
counter-clockwise encircling direction of the EP, respectively, yielding 
a specific and different output mode depending only on the side from 
which the waves are injected. 
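The direction dependence described above can be illustrated by integrating the Schrödinger-type equation for Hamiltonian (1) once around a closed loop. This is a minimal sketch under stated assumptions: the loss rates, the elliptical loop, the loop duration and the fixed-step Runge-Kutta integrator are all illustrative choices, not the parameters or the method used for the figures, and `direction = +1` or `-1` is just a label for the two senses of traversal.

```python
import numpy as np

GAMMA1, GAMMA2 = 1.0, 0.0            # assumed loss rates (arbitrary units)
G_EP = abs(GAMMA1 - GAMMA2) / 4.0    # EP coupling at delta = 0

def hamiltonian(theta):
    """Hamiltonian (1) on an elliptical loop centred on the EP."""
    delta = 0.5 * np.sin(theta)
    g = G_EP + 0.2 * np.cos(theta)
    return np.array([[delta - 0.5j * GAMMA1, g],
                     [g, -0.5j * GAMMA2]], dtype=complex)

def sorted_eigensystem(h):
    """Eigenvalues and eigenvectors ordered by the real part of the eigenvalue."""
    vals, vecs = np.linalg.eig(h)
    order = np.argsort(vals.real)
    return vals[order], vecs[:, order]

def encircle(initial_index, direction, duration=100.0, steps=4000):
    """Integrate i d(psi)/dt = H(t) psi once around the loop (classical RK4).

    Returns the index (0 or 1) of the eigenvector dominating the final state.
    """
    _, vecs = sorted_eigensystem(hamiltonian(0.0))
    psi = vecs[:, initial_index].astype(complex)
    dt = duration / steps

    def rhs(t, y):
        theta = direction * 2.0 * np.pi * t / duration
        return -1j * (hamiltonian(theta) @ y)

    for n in range(steps):
        t = n * dt
        k1 = rhs(t, psi)
        k2 = rhs(t + dt / 2, psi + dt / 2 * k1)
        k3 = rhs(t + dt / 2, psi + dt / 2 * k2)
        k4 = rhs(t + dt, psi + dt * k3)
        psi = psi + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        psi /= np.linalg.norm(psi)   # drop the overall loss, keep the mode content
    coeffs = np.linalg.solve(vecs, psi)   # expansion in the end-point eigenbasis
    return int(np.argmax(np.abs(coeffs)))

outcomes = {(start, d): encircle(start, d) for start in (0, 1) for d in (+1, -1)}
```

For a sufficiently slow loop, `outcomes` is expected to depend on `direction` only and not on the starting eigenvector. Shorter durations or loops that miss the EP need not show this behaviour.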

First numerical results following this procedure are shown in Fig. 2, 
where we rely on a parametrization of the waveguide modulation 
envelope, $\sigma(x) = (\sigma_0/2)(1 - \cos(2\pi x/L))$, that is restricted to a finite
region ($x \in [0, L]$) and perfectly connects to flat semi-infinite waveguides
outside this domain. Deviating from what is shown in Fig. 1, we
also choose the detuning $\delta$ to be linear in $x$, $\delta(x) = \delta_0(2x/L - 1) + \rho$,
which, together with $\sigma(x)$ from above, still describes a loop around
the EP, since the endpoints of this parameter-trajectory correspond 
to identical waveguide configurations (see Supplementary 
Information for details). By implementing these design considera- 
tions in a waveguide first with uniform (bulk) loss in the transverse 
direction, the desired asymmetric switching of modes is, indeed, fully 
realized, as follows. Either mode entering from the left (Fig. 2a, b) is 
scattered into the first mode at the right exit lead. In contrast, any 
mode injected from the right side of the waveguide yields the second 
mode at the left exit lead (Fig. 2c, d). On the downside, however, 
the large overall loss both states have to acquire in order to manifest 
this asymmetry considerably deteriorates the quality of this 
switching mechanism. Additionally, the requirement of slow 


Figure 2 | Chiral transport in the presence of bulk absorption. 

a-d, Numerically simulated modal wavefunction intensities for a 
waveguide with a length-to-width ratio of L/W = 100 (the depicted 
dimensions are not to scale). Shown are results for different input modes 
and injection directions: arrows indicate the side from which the 
waveguide is excited; the first mode is injected in a and c, and the second
mode is injected in b and d. We use a logarithmic scale for the respective 
intensities since the overall dissipation is very strong, as is evident from 


encircling translates into a long and bulky device with many bound- 
ary oscillations. 
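The parametrization used for Fig. 2 can be written down directly. The sketch below uses the Fig. 2 parameter values in units of the width $W$; the function and variable names are hypothetical. It shows that the modulation amplitude vanishes at both ends of the device, so the start and end points of the path are the same unmodulated waveguide even though $\delta$ differs there.

```python
import numpy as np

W = 1.0                 # waveguide width, used as the length unit
L = 100.0 * W           # modulated length, L/W = 100 as in Fig. 2
SIGMA0 = 0.07 * W       # sigma_0 / W = 0.07
DELTA0 = 0.5 / W        # delta_0 W = 0.5
RHO = -0.5 / W          # rho W = -0.5

def sigma(x):
    """Modulation amplitude envelope sigma(x) for 0 <= x <= L."""
    return 0.5 * SIGMA0 * (1.0 - np.cos(2.0 * np.pi * x / L))

def delta(x):
    """Linear detuning sweep delta(x) for 0 <= x <= L."""
    return DELTA0 * (2.0 * x / L - 1.0) + RHO

x = np.linspace(0.0, L, 1001)
trajectory = np.column_stack([delta(x), sigma(x)])   # path in (delta, sigma) space
```

With these values the detuning runs from $-1/W$ at the entrance to 0 at the exit, while the amplitude peaks at $\sigma_0$ in the middle of the device.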

To overcome both of these obstacles, we devised the following two 
strategies. First, we designed the absorption in the waveguide to follow 
a spatial pattern that minimizes (maximizes) the dissipation for the 
mode featuring the adiabatic (non-adiabatic) transition, while leav- 
ing the topology of the loop around the EP intact (see Supplementary 
Information for details). Remarkably, no matter which spatial profile 
we choose for the absorber, the reciprocity principle ensures that our 
design works for both transmission directions equivalently. Second, 
we employed a combination of quasi-Newton methods with stochastic 
algorithms to decrease the system length, resulting in a length-to-width 
ratio reduced by a factor of four as compared to the devices shown in 
Fig. 2. In this optimization, we tuned the parameters $\sigma_0$, $\delta_0$ and $\rho$ such
as to reduce the waveguide length while making sure that the resulting 
device still maintains the frequency robustness inherent in our design 


the corresponding values for transmission $T_{nm}$ from mode $n$ into mode $m$:
Ty =2.5 x 107", Ty, =8.9 x 1077, Ty =7.0 x 1074 and Ty =8.4 x 107. 
The normalized mode profiles at the waveguide exit, which clearly show 
the efficient mode-switching, are shown in Supplementary Fig. 10. e, Plot 
of the absorption strength, which is gradually switched on and off, but is 
uniform in the transverse direction. Specifically, for the above waveguide, 
$\omega W/c\pi = 2.05$, $\sigma_0/W = 0.07$, $\delta_0 W = 0.5$ and $\rho W = -0.5$.


principle (see Supplementary Fig. 11 for this efficient device geometry 
and the corresponding numerical results). 

To demonstrate its potential for real-world applications, we provide 
here the first experimental realization of the above protocol, 
implemented in a surface-modulated microwave setup following 
the proposed efficient design (Fig. 3a, b). Measuring the modal 
transmission intensities $T_{nm}$ from mode $n$ into mode $m$ as a function
of the input signal frequency, we unambiguously confirm the 
asymmetric switching effect (see Fig. 3c): An arbitrary combination
of modes injected from the left side of the waveguide is transmitted
into the first mode when arriving at the exit lead on the right ($T_{11}$
and $T_{21}$ dominate the transmission of the first and second mode,
respectively, with transmission intensity ratios $T_{11}/T_{12} = 20.6$ and
$T_{21}/T_{22} = 23.0$). At the same time, the second mode is produced on
the left for injection from the right ($T'_{12}$ and $T'_{22}$ dominate the respective
modal transmission, where the primed quantities are those for


Figure 3 | Microwave measurements. a, Photograph of the optimized 
waveguide channel used in the experiment, with a surface-modulated 
region of length L = 1.25 m and width W=5 cm (image credit J.B. and 
U.K., 2015). Within this setup, input and output antennas are placed 

1.5 m apart (shown on the top plate). Black foam is used both as an 
absorber in the centre of the waveguide (magnified in b) and to mitigate 
the reflection into the entrance and exit leads. The setup is engineered for 
a target frequency of $\nu = 7.8$ GHz (shown by a dashed vertical line in c),
but the design ensures applicability over a broad frequency interval.
c, Measured frequency-dependent transmission intensities $T_{nm}$ ($T'_{nm}$) from
mode $n$ into mode $m$ for injection from the left (right) are shown by solid
(dashed) lines. The waveguide parameters here are $\omega W/c\pi = 2.6$,
$\sigma_0/W = 0.16$, $\delta_0 W = 1.25$ and $\rho W = -1.8$.
injection from the right, with ratios $T'_{12}/T'_{11} = 463.4 \approx T_{21}/T_{11} = 488.6$
and $T'_{22}/T'_{21} = 425.9 \approx T_{22}/T_{12} = 438.4$). Note that the slight violation
of the reciprocity property $T'_{nm} = T_{mn}$ observed in the experiment
(see Fig. 3c) is due to the magnetized absorber material (see details in 
Methods), which is needed to obtain a sufficiently strong absorption 
in the corresponding frequency range (without the absorber, the exper- 
iment is fully reciprocal). This small non-reciprocity is, however, not 
essential for the operation of our device, since the respective intensity 
ratios are approximately the same for both injection directions. Most 
importantly, the experimental data proves the very strong robustness 
of these transmission values with respect to variations of the input fre- 
quency—a broad-band feature that is a direct consequence of our 
design principle, which ensures operability also in the presence of small 
variations of the waveguide parameters. The shortened device for which 
the length-to-width ratio is now L/W =25 also vastly outperforms the 
longer device in Fig. 2 (for which L/W = 100), not only in terms of 
length-to-width ratio, but also in terms of the output intensity which 
is here increased by six orders of magnitude. 
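The quoted device parameters can be cross-checked against the waveguide dispersion given earlier. The sketch below assumes the vacuum speed of light and takes the width, modulated length and target frequency from the Fig. 3 caption; the function and variable names are hypothetical. It evaluates the dimensionless frequency $\omega W/c\pi$, counts the propagating transverse modes, and computes the boundary wavenumber that couples them resonantly at zero detuning.

```python
import numpy as np

C_LIGHT = 299_792_458.0   # speed of light in vacuum (m/s)
NU = 7.8e9                # target frequency (Hz)
W = 0.05                  # waveguide width (m)
L = 1.25                  # length of the surface-modulated region (m)

omega = 2.0 * np.pi * NU
dimensionless_frequency = omega * W / (C_LIGHT * np.pi)   # omega W / (c pi)

def wavenumber(n):
    """Longitudinal wavenumber k_n; a purely imaginary value is evanescent."""
    return np.sqrt(complex((omega / C_LIGHT) ** 2 - (n * np.pi / W) ** 2))

# Modes with a real k_n propagate; all others decay along the waveguide.
propagating = [n for n in range(1, 6) if wavenumber(n).imag == 0.0]

k1, k2 = wavenumber(1).real, wavenumber(2).real
resonant_boundary_wavenumber = k1 - k2      # k_b for delta = 0 (1/m)
length_to_width = L / W
```

A value of $\omega W/c\pi$ between 2 and 3 is what restricts the transport to the two modes $n = 1, 2$.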

As an ultimate proof that the functionality of our device hinges on 
a dynamical EP-encircling, we also fabricated five waveguides, with 
individual boundary frequencies and amplitudes distributed over 
the parameter loop around the EP inherent in the chirped waveguide 
design of Fig. 3. Concatenating these stroboscopic results allows us 
to eliminate the dynamics in the loop around the EP, resulting in a 
parametric EP-encircling for which all non-adiabatic contributions 
should vanish. Remarkably, our results for this case (see Supplementary 
Fig. 9) fully reproduce the symmetric state flips that were observed in 
all previous experiments''!-3 where such a parametric EP-encircling 
was implemented. 

In summary, our work constitutes the first experimental encircling 
of an exceptional point that stays faithful to the full dynamical and 
non-adiabatic behaviour occurring in this context. In this way, we have 
devised a notably platform-independent approach to mode switching 
that is implementable not just for microwaves, but readily applicable 
also to light, acoustic or matter waves. An accompanying paper also 
reports on a dynamical EP-encircling in an optomechanical setup". 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS


Numerical simulations. In our numerical simulations we solve the Helmholtz 
equation (2) on a finite-difference grid by means of a Green's function method”. 
The transmission (reflection) amplitudes $t_{nm}$ ($r_{nm}$) are then determined by
projecting the system’s Green’s function onto the flux-carrying modes in the
semi-infinite leads that are attached to the scattering geometry. The corresponding
intensities are given by $T_{nm} = |t_{nm}|^2$ and $R_{nm} = |r_{nm}|^2$, respectively. We choose the
real part of the potential V(x, y) to be finite (infinite) inside (outside) the cavity, 
corresponding to Dirichlet boundary conditions, and the imaginary part of the 
potential is determined such as to satisfy the protocol described in the main text. 
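As a small illustration of the finite-difference discretization with Dirichlet walls, the sketch below diagonalizes the transverse part of the Laplace operator on a uniform grid and compares the lowest eigenvalues with the analytic values $(n\pi/W)^2$. It is not the Green's function method used for the simulations; the grid size and function name are assumptions made for the example.

```python
import numpy as np

def transverse_modes(width=1.0, n_points=200):
    """Eigenpairs of -d^2/dy^2 with Dirichlet walls on a uniform grid."""
    h = width / (n_points + 1)                    # grid spacing (walls excluded)
    main = np.full(n_points, 2.0 / h**2)
    off = np.full(n_points - 1, -1.0 / h**2)
    operator = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.eigh(operator)               # real symmetric matrix

eigenvalues, _ = transverse_modes()
numerical = eigenvalues[:2]
analytic = np.array([(n * np.pi) ** 2 for n in (1, 2)])   # (n pi / W)^2, W = 1
relative_error = np.abs(numerical - analytic) / analytic
```

The discretization error scales with the square of the grid spacing, so refining the grid brings the numerical values closer to the analytic ones.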
Experimental setup. The experimental device is an aluminium waveguide with 
dimensions L x W x H=2.38m x 5cm x 8mm. Figure 3a shows the surface 
modulation that steers the modes around the EP. Our microwave experiment 
allows us to define the corresponding boundary conditions very accurately 
and to place the magnetized absorbing foam material (LS-10211 foam from 
ARC Technologies, W x H=2.5mm x 5mm) with sub-wavelength (<0.5 mm) 
precision. Additional absorbers (LS-14 and LS-16 foams from Emerson and 
Cuming, W x L=5cm x 17.5cm) are employed to mimic semi-infinite leads. 
Microwave measurements. To probe the sinusoidal modes formed by the 
$z$ component of the electric field $E_z$ (ref. 33), we use two microwave antennas
1.5m apart. The antennas are fixed onto motor-controlled, moveable slides and 
measure the complex transmission signal outside of the modulated surface area 
at 2 x 2 points along the y axis of the antennas. For the measurements we employ 
microwaves with a frequency around $\nu = 7.8$ GHz, which is well below the cutoff
frequency for TE9 modes ($\nu_c = c/2H = 18.75$ GHz), such that only the first two
sinusoidal TE modes contribute to the transport. By applying the twofold sine 
transformation: 


$$t_{nm} \propto \sum_{y_1, y_2} t'(y_1, y_2)\,\sin\!\left(\frac{n\pi}{W}y_1\right)\sin\!\left(\frac{m\pi}{W}y_2\right)$$
where $t'(y_1, y_2)$ denotes the normalized transmission measured between antenna 1
at position $y_1$ and antenna 2 at position $y_2$, we obtain the transmission matrix
$t_{nm}$ in its mode representation. The normalization is necessary to overcome the
frequency-dependent coupling of the two antennas, and is given by:


$$t'(y_1, y_2) = \frac{t(y_1, y_2)}{\sqrt{(1 - |\langle r_1\rangle|^2)(1 - |\langle r_2\rangle|^2)}}$$


Here, $t(y_1, y_2)$ describes the transmission amplitude and $\langle r_1\rangle$ ($\langle r_2\rangle$) denotes the
measured reflection amplitude at antenna 1 (antenna 2), averaged over all positions
$y_1$ ($y_2$) and over a frequency window of 0.076 GHz. The measured reflection
amplitudes are dominated by an imperfect impedance matching between the 
antenna and the channel. This results in a strong reflection signal originating from 
the antenna itself, which contains no information about the waveguide. We thus 
normalize the intensity fed into the system with the denominator in the above 
expression for $t'(y_1, y_2)$, which allows us to compare the transmission in a broader
frequency range. 
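The normalization and the twofold sine transformation can be sketched as follows. Several details here are assumptions made for illustration only: the antenna positions are taken to be equally spaced interior points $y_k = kW/(N+1)$, the prefactor $(2/(N+1))^2$ is chosen so that the discrete sines are orthonormal, and the function names and synthetic data are hypothetical.

```python
import numpy as np

def normalize(t, r1_mean, r2_mean):
    """t'(y1, y2): divide by the square root of the power actually fed in."""
    return t / np.sqrt((1 - abs(r1_mean) ** 2) * (1 - abs(r2_mean) ** 2))

def mode_transmission(t_prime, n_modes=2):
    """Twofold discrete sine transform of t'(y1, y2) into t_nm."""
    n_pos = t_prime.shape[0]
    k = np.arange(1, n_pos + 1)
    # sines[n-1, k-1] = sin(n pi y_k / W) at y_k = k W / (n_pos + 1)
    sines = np.sin(np.outer(np.arange(1, n_modes + 1), k) * np.pi / (n_pos + 1))
    return (2.0 / (n_pos + 1)) ** 2 * sines @ t_prime @ sines.T

# Synthetic data: mode 1 at antenna 1 is transmitted into mode 2 at antenna 2.
n_pos = 9                                            # assumed number of positions
y = np.arange(1, n_pos + 1) / (n_pos + 1)            # y / W
t_raw = 0.3 * np.outer(np.sin(np.pi * y), np.sin(2 * np.pi * y))
t_modes = mode_transmission(normalize(t_raw, r1_mean=0.6, r2_mean=0.0))
```

In this synthetic case only the element `t_modes[0, 1]` is non-zero, and the antenna reflection of 0.6 rescales it from 0.3 to 0.375.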
Topological energy transfer in an optomechanical
system with exceptional points 


H. Xu!, D. Mason!, Luyao Jiang! & J. G. E. Harris! 


Topological operations can achieve certain goals without requiring 
accurate control over local operational details; for example, 
they have been used to control geometric phases and have been 
proposed as a way of controlling the state of certain systems within 
their degenerate subspaces! *. More recently, it was predicted that 
topological operations can be used to transfer energy between 
normal modes, provided that the system possesses a specific 
type of degeneracy known as an exceptional point?"!!. Here we 
demonstrate the transfer of energy between two vibrational modes 
of a cryogenic optomechanical device using topological operations. 
We show that this transfer arises from the presence of an exceptional 
point in the spectrum of the device. We also show that this transfer 
is non-reciprocal!?"4, These results open up new directions in 
system control; they also open up the possibility of exploring other 
dynamical effects related to exceptional points'*'°, including the 
behaviour of thermal and quantum fluctuations in their vicinity. 

An externally imposed time variation of the Hamiltonian H of an 
otherwise isolated, conservative system provides a powerful means 
for controlling the evolution of the system. If H is varied sufficiently 
slowly, then the adiabatic theorem states that a system prepared at some 
initial time $t_0$ in a non-degenerate normal mode of $H(t_0)$ will remain
in the corresponding normal mode of the instantaneous H(t) (ref. 17). 
As a result, varying H so as to execute a closed loop (in the space of 
parameters that define H) will return the system to its initial state, up to 
an overall phase. This phase was shown by Berry and others to include 
a contribution that is determined by a simple geometric property of 
the control loop'*. The subsequent insight that such a topological 
operation (that is, executing a closed control path) may have an 
outcome that is robust against small fluctuations in the control path has 
had a profound impact on many areas of theory and experiment? *!*, 

More recently, it was predicted*"’ that topological operations may 
also be used to transfer energy between modes in systems that are 
subject to loss and/or gain. Specifically, energy transfer was predicted 
to occur for closed adiabatic control paths that enclose an exceptional 
point (EP, a form of degeneracy that can arise when the effective 
Hamiltonian is non-Hermitian; also known as a branch point). It 
was also predicted!?"4 that such operations can be non-reciprocal 
in their dependence on the initial conditions of the system and the 
direction of rotation of the control loop about the EP. The possibility 
of using topological operations to control the energy distribution 
within a system while also inducing non-reciprocal behaviour has 
attracted considerable attention’? **. Some features of EPs have been 
demonstrated in static measurements of spectra and eigenmodes” 4. 
However, experiments have not yet realized topological or non-reciprocal
dynamics by encircling an EP.

Here we measure topological and non-reciprocal dynamics in an 
optomechanical system. We show that the system possesses an EP 
and that external control parameters can be used to encircle the EP on 
timescales comparable to the lifetime of the excitations of the system. 
We demonstrate that such topological operations can transfer energy 
and that this energy transfer is non-reciprocal. When the control path 


is not adiabatic, the dynamics becomes more complicated; however, we 
find quantitative agreement between experimental data and numerical 
simulations over the full range of measurements. 

The system studied here consists of a silicon nitride membrane 
placed inside a high-finesse optical cavity**. The dimensions of the 
membrane are 1 mm × 1 mm × 50 nm. Because it is almost perfectly
square, the vibrational eigenmodes of the membrane include nearly 
degenerate pairs that are well-separated in frequency from all the other 
eigenmodes. We use this separation to focus on a nearly degenerate 
pair with natural frequencies $\omega_1/(2\pi) = 788.024$ kHz and $\omega_2/(2\pi) = 788.487$ kHz.
In the absence of laser light driving the optical cavity,
these two modes are essentially uncoupled and have very small
damping rates ($\gamma_1/(2\pi) = 0.6$ Hz and $\gamma_2/(2\pi) = 1.4$ Hz).

When a laser excites the cavity, the resultant intracavity field 
a drives the vibrations of the membrane via radiation pressure. At the 
same time, these vibrations detune the cavity and thereby modulate 
a (refs 25, 26). It is straightforward to integrate a(t) out of the full 
optomechanical equations of motion (see Methods), resulting in an 
effective equation of motion for just $c_1$ and $c_2$, the displacements of the
modes of the membrane: 


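$$i\dot{C}(t) = HC(t) \qquad (1)$$

where $C(t) = [c_1(t), c_2(t)]^{\rm T}$. The effective Hamiltonian is

$$H = \begin{pmatrix} \omega_1 - i\gamma_1/2 - ig_1^2\sigma & -ig_1g_2\sigma \\ -ig_1g_2\sigma & \omega_2 - i\gamma_2/2 - ig_2^2\sigma \end{pmatrix} \qquad (2)$$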


where $g_{1,2}$ are the optomechanical coupling rates of the mechanical
modes, and the complex mechanical susceptibility introduced by the 
intracavity field is 


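$$\sigma = \frac{P\kappa_{\rm in}}{\hbar\Omega_L[(\kappa/2)^2 + \Delta^2]}\left[\frac{1}{\kappa/2 - i(\omega_0 + \Delta)} - \frac{1}{\kappa/2 + i(-\omega_0 + \Delta)}\right]$$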

Here $P$ and $\Omega_L$ are the power and frequency of the laser driving the
cavity, $\Delta$ is the mean detuning between the laser and the cavity,
$\omega_0 = (\omega_1 + \omega_2)/2$, and $\kappa$ and $\kappa_{\rm in}$ are the linewidth and input coupling
rate of the cavity, respectively. The experiment described here is
classical; the reduced Planck constant $\hbar$ appears in the expression for
$\sigma$ because $g_{1,2}$ are given in terms of the single-photon rate.

The system will possess an EP if $\sigma$ can be made to equal
$(\omega_1 - i\gamma_1/2 - \omega_2 + i\gamma_2/2)[-i(g_1^2 - g_2^2) \pm 2g_1g_2]/(g_1^2 + g_2^2)^2$. Achieving
this typically requires control over both Re($\sigma$) and Im($\sigma$). For
optomechanical devices in the resolved sideband regime ($\kappa \ll \omega_0$), this
control is provided by $P$ and $\Delta$. By contrast, when $\kappa \gg \omega_0$, $P$ and $\Delta$
appear in $\sigma$ in a linearly dependent fashion and so control only $|\sigma|$. The
ability to access (and encircle) an EP using the detuning and power of
a single laser is an important feature of the system presented here (and
Figure 1 | The complex eigenvalues of the normal modes of the
membrane. The resonance frequency (horizontal axis) and damping rate
(vertical axis) of the two mechanical modes of the membrane as a function
of the laser power $P$ and detuning $\Delta$. Data for one mode are shown as
squares; data for the other mode are shown as circles. The statistical
uncertainty in the measurements is smaller than the symbols. Colours
indicate $P$, while the arrows indicate the variation of the eigenvalues as
$\Delta$ is varied from −1,200 kHz to −400 kHz at fixed $P$. For the lower values
of $P$, each eigenvalue follows a closed trajectory, beginning and ending
at the same point. For the higher values of $P$, the eigenvalues follow
open trajectories, each one ending at the starting point of the other. The
solid lines are the global fit described in the text. The location of the EP
predicted by this fit is shown as a black cross.


in contrast with the more complicated arrangement proposed in 
ref. 27), because these parameters can be controlled in situ with a high 
degree of precision, timing accuracy, and dynamic range. 

A detailed description of the optomechanical device and the 
measurement set-up is given in Methods. The membrane and optical 
cavity are maintained at T=4.2 K. The motion of the membrane is 
monitored via a heterodyne measurement of a laser with constant 
power and detuning. Control over the optomechanical system is 
provided by a separate laser, whose detuning $\Delta$ and power $P$ are set by
an acousto-optic modulator. 

To establish the presence of an EP in this system, we measured the 
mechanical spectrum of the membrane as a function of $\Delta$ and $P$. These
spectra were acquired by driving the membrane and monitoring its
response via the heterodyne signal. As described in Methods, each
spectrum was fitted to determine the two resonance frequencies
$\omega_{a,b}(\Delta, P)$ and damping rates $\gamma_{a,b}(\Delta, P)$. (The subscripts ‘a’ and ‘b’ refer
to the normal modes of the membrane in the presence of an optical 
field; the subscripts ‘1’ and ‘2’ used previously refer to these modes in 
the absence of an optical field.) 

The results of these fits are summarized in Fig. 1, which shows the 
complex eigenvalues $\lambda_{a,b} = \omega_{a,b} - i\gamma_{a,b}/2$ as $\Delta$ and $P$ are varied. When
$P < 155$ μW, $\lambda_a$ and $\lambda_b$ each trace out a closed trajectory, completing a
loop as $\Delta$ is varied from $\ll -\omega_0$ to $\gg -\omega_0$. By contrast, when
$P > 265$ μW, $\lambda_a$ and $\lambda_b$ both follow open trajectories, swapping their
values as $\Delta$ is varied over the same range. This sharp transition in the
topology of $\lambda(\Delta)$ is characteristic of an EP®. The solid lines in Fig. 1
are a global fit to the complex eigenvalues of $H$, which gives best-fit
values of $\omega_{1,2}$ and $\gamma_{1,2}$ as stated above, as well as $g_1/(2\pi) = 1.03$ Hz,
$g_2/(2\pi) = 1.14$ Hz, $\kappa_{\rm in}/(2\pi) = 70$ kHz and $\kappa/(2\pi) = 177$ kHz. These
values imply the existence of an EP at $\Delta_{\rm EP}/(2\pi) = -792.5$ kHz,
$P_{\rm EP} = 223$ μW (or equivalently $\omega_{\rm EP}/(2\pi) = 788.2$ kHz and
$\gamma_{\rm EP}/(2\pi) = 460$ Hz, indicated as the black cross in Fig. 1).
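The quoted EP location can be reproduced from the best-fit values. The sketch below is an illustration under stated assumptions: it takes the effective Hamiltonian of equation (2) to have diagonal entries $\omega_j - i\gamma_j/2 - ig_j^2\sigma$ and off-diagonal entries $-ig_1g_2\sigma$, uses the EP condition on $\sigma$ given in the text, and works with all rates divided by $2\pi$ in a frame shifted by $\omega_0$. The function names and the grid search for the detuning are hypothetical choices. The laser frequency is not given in this section, so the power $P_{\rm EP}$ is not computed.

```python
import numpy as np

# Best-fit parameters quoted in the text, all divided by 2 pi (Hz).
F1, F2 = 788.024e3, 788.487e3
GAMMA1, GAMMA2 = 0.6, 1.4
G1, G2 = 1.03, 1.14
KAPPA = 177e3
F0 = 0.5 * (F1 + F2)

def hamiltonian(s):
    """H/(2 pi) in a frame shifted by F0; s = 2 pi sigma (units of 1/Hz)."""
    return np.array([
        [F1 - F0 - 0.5j * GAMMA1 - 1j * G1**2 * s, -1j * G1 * G2 * s],
        [-1j * G1 * G2 * s, F2 - F0 - 0.5j * GAMMA2 - 1j * G2**2 * s]])

# EP condition on sigma; keep the root that adds damping (Re s > 0).
d = (F1 - F2) - 0.5j * (GAMMA1 - GAMMA2)
roots = [d * (-1j * (G1**2 - G2**2) + sign * 2 * G1 * G2) / (G1**2 + G2**2) ** 2
         for sign in (+1, -1)]
s_ep = max(roots, key=lambda s: s.real)

lam = np.linalg.eigvals(hamiltonian(s_ep))
gap = abs(lam[0] - lam[1])                 # vanishes (up to round-off) at the EP
f_ep = F0 + lam.mean().real                # resonance frequency at the EP (Hz)
gamma_ep = -2.0 * lam.mean().imag          # damping rate at the EP (Hz)

def bracket(detuning):
    """Detuning-dependent factor of sigma; its phase does not depend on P."""
    return (1 / (KAPPA / 2 - 1j * (F0 + detuning))
            - 1 / (KAPPA / 2 + 1j * (-F0 + detuning)))

# Detuning at which the phase of sigma matches the phase required for the EP.
grid = np.linspace(-1200e3, -400e3, 800001)
delta_ep = grid[np.argmin(np.abs(np.angle(bracket(grid)) - np.angle(s_ep)))]
```

With the rounded parameters above, `f_ep`, `gamma_ep` and `delta_ep` land close to the quoted 788.2 kHz, 460 Hz and −792.5 kHz; small differences are expected from rounding of the fitted values.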

Figure 2a, b shows measurements of Re($\lambda_{a,b}$) and −2Im($\lambda_{a,b}$) over
a narrow range of $\Delta$ and $P$ centred on $\Delta_{\rm EP}$ and $P_{\rm EP}$. These measurements
show the characteristic features of an EP: $\lambda_a$ and $\lambda_b$ coalesce
at a single value of the control parameters and, in the vicinity of this
point, they exhibit the same structure as the Riemann sheets of the
complex square-root function $z^{1/2}$. For comparison, Fig. 2c, d shows
the eigenvalues of H (see equation (2)), calculated using the best-fit 
values determined in Fig. 1. 

The surfaces shown in Fig. 2a, b are such that if $\Delta$ and $P$ were varied
to execute a single closed loop, the resulting smooth evolution on the 
eigenvalue manifold would return to its starting point only if the loop 
did not enclose the EP. By contrast, a loop enclosing the EP would result 
in a trajectory starting on one sheet, but ending on the other. 


Figure 2 | The exceptional point in the spectrum of mechanical modes. a, b, The
resonance frequencies (a) and damping rates (b) of the two mechanical modes of the
membrane as a function of laser power $P$ and detuning $\Delta$. Each grid point
corresponds to a measurement; grid lines and surface colouring are guides to the eye.
Colouring is chosen so that red (blue) corresponds to the mode with lower
(higher) damping. c, d, Plots of the theoretically calculated real (c) and
imaginary (d) parts of the eigenvalues of the effective Hamiltonian matrix $H$
(equation (2)). All of the parameters appearing in this calculation are taken
from the fit in Fig. 1. Note that the viewing angle in a and c differs from
that in b and d.
Figure 3 | Topological energy transfer. a, b, The energies of mode ‘a’
(red) and mode ‘b’ (blue) as a function of time $t$. A drive is applied to
the ‘a’ mode for $t < 0$. At $t = 0$ the drive is turned off and the control loop
described in the text is implemented. The control loop ends at $t = 16$ ms;
the grey shaded region corresponds to the time during which the
control loop is implemented. For $t > 16$ ms the system relaxes to thermal
equilibrium. The black lines are fits to a decaying exponential (due to the
mechanical damping) with a constant offset (reflecting the thermal motion
of the mode). The black dot shows the extrapolation of this fit to $t = 16$ ms.
The loop used in a does not enclose the EP, whereas the loop used in
b does. c, The fraction of the (remaining) energy in the ‘b’ mode after the
control loop has been completed as a function of the maximum detuning
of the loop, $\Delta_{\max}$. The left (right) point shown as a solid circle corresponds
to the data in a (b). d, The corresponding measurement as a function of
the maximum power of the loop, $P_{\max}$. In c and d, the statistical errors are
comparable to or smaller than the size of the symbols. The solid lines are
numerical simulations of the dynamics and are completely constrained by
the parameters from the fit in Fig. 1. The insets are schematics showing
how the loop varies along the horizontal axis of each panel; the location
of the EP is indicated by the black cross.


To observe this effect, we performed a series of measurements in 
which $\Delta$ and $P$ were initially set to $\Delta_{\max}$ and $P_{\min}$, and one of the modes
of the membrane ($c_a$) was excited using a piezoelectric element. Once
the system reached its steady state, the piezo drive was switched off,
and $\Delta$ and $P$ were varied to sweep out a closed rectangular loop. The
loop was defined by the points ($\Delta_{\max}$, $P_{\min}$), ($\Delta_{\max}$, $P_{\max}$), ($\Delta_{\min}$, $P_{\max}$)
and ($\Delta_{\min}$, $P_{\min}$), returning to ($\Delta_{\max}$, $P_{\min}$) after a duration $\tau = 16$ ms.
This value of $\tau$ was chosen so that nearly all such control loops satisfy
the requirement of conventional adiabaticity: $\tau \gg 1/|\lambda_a - \lambda_b|$ (loops
passing close to the EP do not satisfy this inequality). We describe the
effect of varying $\tau$ below.
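A rough estimate shows why a 16 ms loop counts as slow away from the EP. The sketch below assumes that, at the low-power corner of the loop, the eigenvalues are close to those of the bare membrane modes quoted earlier; this is an approximation made for illustration, and the variable names are hypothetical.

```python
import numpy as np

TAU = 16e-3                       # loop duration (s)
F1, F2 = 788.024e3, 788.487e3     # bare mode frequencies / 2 pi (Hz)
GAMMA1, GAMMA2 = 0.6, 1.4         # bare damping rates / 2 pi (Hz)

# Complex eigenvalues in angular units, neglecting the optomechanical coupling.
lam_1 = 2 * np.pi * (F1 - 0.5j * GAMMA1)
lam_2 = 2 * np.pi * (F2 - 0.5j * GAMMA2)

adiabatic_timescale = 1.0 / abs(lam_1 - lam_2)    # 1 / |lambda_a - lambda_b| (s)
adiabaticity_ratio = TAU / adiabatic_timescale    # should be much larger than 1
```

Under this assumption the loop duration exceeds the adiabatic timescale by a factor of several tens. Near the EP the eigenvalue difference shrinks and the ratio drops, which is why loops passing close to the EP violate the inequality.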

The heterodyne signal was recorded before, during and after 
the control loop. This signal was demodulated at frequencies
$\omega_a(\Delta_{\max}, P_{\min})$ and $\omega_b(\Delta_{\max}, P_{\min})$, with typical results shown in
Fig. 3a, b. Before and after the control loop (that is, for $t < 0$ and $t > \tau$),
this record corresponds to the amplitudes of the motion of the normal
modes $|c_a(t)|$ (red in Fig. 3a, b) and $|c_b(t)|$ (blue). During the control
loop ($0 < t < \tau$) this correspondence does not hold, because the eigenfrequencies
of the membrane undergo rapid variations; data from this
region do not play any role in our analysis. As shown in Fig. 3a, b, $c_a$ is
initially excited to about $4 \times 10^{-12}$ m. There is also a small excitation of $c_b$
(owing to the non-zero overlap of the mechanical resonances); however,
this unintentional excitation accounts for less than about 1% of the total
energy, and does not qualitatively affect the results presented here.

Comparing $|c_{a,b}(0)|$ with $|c_{a,b}(\tau)|$ in Fig. 3a, b, it is clear that energy is
lost from the system during the control loop. This reflects the fact that 
the damping here is always positive. To distinguish this overall energy 
loss from effects related to the topological operation, we focus on the 
relative energy of the two modes before and after the loop. 

The data in Fig. 3a were taken for a control loop that did not enclose 
the EP ($\Delta_{\max}$ = −1,440 kHz, $P_{\max}$ = 750 μW; for all data, $\Delta_{\min}$ = −1,890 kHz,
$P_{\min}$ = 2 μW). As a result, the nearly adiabatic transit
around the control loop results in negligible energy transfer at the end
of the control loop. This can be seen qualitatively in Fig. 3a by noting
that approximately 99% of the energy is in $c_a$ both immediately before
and immediately after the control loop.

By contrast, Fig. 3b shows a measurement in which the control loop 
does enclose the EP ($\Delta_{\max}$ = −300 kHz, $P_{\max}$ = 750 μW). The effect on
the dynamics is readily visible: before the loop more than 99% of the
energy is in $c_a$, whereas after the loop more than 99% of the (remaining)
energy is in $c_b$.
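Whether a given rectangular loop encloses the EP follows directly from its corner values. The sketch below uses the loop limits and EP location quoted in the text, reading the EP power as 223 μW; the assumption of equal time on each side of the rectangle and all function names are illustrative and not taken from the paper.

```python
import numpy as np

DELTA_EP = -792.5e3      # EP detuning (Hz)
P_EP = 223e-6            # EP power (W), read as microwatts
DELTA_MIN = -1890e3      # minimum detuning used for all data (Hz)
P_MIN = 2e-6             # minimum power used for all data (W)

def loop_corners(delta_max, p_max):
    """Corner sequence of the rectangular loop, closed on its starting point."""
    return [(delta_max, P_MIN), (delta_max, p_max), (DELTA_MIN, p_max),
            (DELTA_MIN, P_MIN), (delta_max, P_MIN)]

def control_path(t, delta_max, p_max, tau=16e-3):
    """(Delta, P) at time t, assuming equal time on each side of the loop."""
    corners = np.array(loop_corners(delta_max, p_max))
    knots = np.linspace(0.0, tau, len(corners))
    return (float(np.interp(t, knots, corners[:, 0])),
            float(np.interp(t, knots, corners[:, 1])))

def encloses_ep(delta_max, p_max):
    """True if the EP lies strictly inside the rectangle."""
    return DELTA_MIN < DELTA_EP < delta_max and P_MIN < P_EP < p_max

fig3a_encloses = encloses_ep(-1440e3, 750e-6)   # loop of Fig. 3a
fig3b_encloses = encloses_ep(-300e3, 750e-6)    # loop of Fig. 3b
```

The two loops differ only in their maximum detuning, which is what moves the EP from outside to inside the rectangle.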

To quantify the transfer of energy from one mode to another, we 
define the efficiency $E = |c_b(\tau)|^2/[|c_a(\tau)|^2 + |c_b(\tau)|^2]$ (this definition
makes use of the fact that, before the loop, nearly all the energy is in $c_a$).
The values of $|c_{a,b}(\tau)|$ are determined by fitting decaying exponentials
to $|c_{a,b}(t)|$ for $t > \tau + 20$ ms and extrapolating these fits to $t = \tau$.
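The extrapolation and the efficiency can be sketched on synthetic data. This is a simplified illustration, not the analysis code used for the measurements: it fits a pure exponential by linear regression on the logarithm of the amplitude, omits the constant thermal offset mentioned in the Fig. 3 caption, and uses noise-free synthetic amplitudes with assumed decay rates. All names are hypothetical.

```python
import numpy as np

TAU = 16e-3   # end of the control loop (s)

def amplitude_at_tau(t, amplitude, t_start=TAU + 20e-3):
    """Fit A exp(-rate (t - TAU)) to data with t > t_start; return A = |c(TAU)|."""
    mask = t > t_start
    slope, intercept = np.polyfit(t[mask] - TAU, np.log(amplitude[mask]), 1)
    return np.exp(intercept)

def efficiency(c_a_tau, c_b_tau):
    """Fraction of the remaining energy in the 'b' mode at t = TAU."""
    return c_b_tau**2 / (c_a_tau**2 + c_b_tau**2)

# Synthetic, noise-free amplitudes (assumed values) decaying after the loop.
t = np.linspace(0.0, 0.5, 5001)
c_a = 1e-13 * np.exp(-np.pi * 0.6 * (t - TAU))
c_b = 3e-13 * np.exp(-np.pi * 1.4 * (t - TAU))

E = efficiency(amplitude_at_tau(t, c_a), amplitude_at_tau(t, c_b))
```

With an amplitude ratio of 3 at $t = \tau$ the energy ratio is 9, so the sketch returns an efficiency of 0.9.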

Figure 3c shows E(Δ_max) for fixed P_max = 750 μW; Fig. 3d shows
E(P_max) for fixed Δ_max = −290 kHz. The limiting behaviour in both
cases (that is, for large or small P_max and Δ_max) agrees with the
prediction that adiabatic paths enclosing the EP will result in energy
transfer, whereas adiabatic paths not enclosing the EP will not. The
solid lines in Fig. 3c, d are the results of numerically integrating
equations (1) and (2), and are not fits; rather, they use the P(t) and
Δ(t) used in the measurements, and the values of g_{1,2}, ω_{1,2}, γ_{1,2}, κ_in and
κ determined from the data in Fig. 1. These simulations show good
agreement with the measurements irrespective of whether the loop
encloses the EP and of whether the loop satisfies adiabaticity.

The measurements shown in Fig. 3 were all made by applying the
initial drive to the ‘a’ mode and then executing a control loop in the
counter-clockwise sense. In this case, the adiabatic trajectories enclosing
the EP correspond to the less-damped eigenmode (red regions of the
surfaces in Fig. 2) for the majority of the loop. By contrast, executing the
same loop in the clockwise sense would result in an adiabatic trajectory
corresponding primarily to the more-damped eigenmode (blue regions
in Fig. 2). As described in refs 12-14, 28, adiabatic behaviour is expected
while the system is in the less-damped eigenmode; however, when the
system is in the more-damped mode, competition between the
non-adiabatic transfer (which is exponentially small in τ) and the effect of
differential loss (which is exponentially large in τ) leads to a breakdown
of adiabaticity, causing the system to eventually relax to the less-damped
mode. This process may also be understood as a consequence of the
Stokes phenomenon of asymptotics.

This behaviour is demonstrated in Fig. 4, which shows E(τ) when the
EP is encircled in the counter-clockwise or clockwise sense, and with
the initial excitation in the ‘a’ mode (for which E is as defined above)
or the ‘b’ mode (for which E is as defined above, but with the subscripts
reversed). The same loop was used in all four cases: Δ_min = −1,890 kHz,
P_min = 2 μW, Δ_max = −290 kHz and P_max = 750 μW. In all four cases,
executing the loop very quickly results in negligible energy transfer
(E → 0 as τ → 0), consistent with the conventional expectation for a
sudden perturbation.

The adiabatic limit (τ > 1 ms) is quite different. Efficient energy
transfer is achieved (E → 1) for an initial excitation in the ‘a’ mode and
a counter-clockwise loop (and for an initial excitation in the ‘b’ mode
and a clockwise loop), consistent with the discussion of Fig. 3, and with
the fact that these conditions correspond to adiabatic paths almost
entirely in the less-damped mode. By contrast, E → 0 when τ > 1 ms
for an initial excitation in the ‘b’ mode and a counter-clockwise loop
(and for an initial excitation in the ‘a’ mode and a clockwise loop).

Figure 4 | Non-reciprocal topological dynamics.
a, b, The transfer efficiency E as a function of the
duration of the control loop, τ. The loop shape is
identical for all four data series and encloses the EP.
The loop is counter-clockwise in a and clockwise
in b, as indicated by the arrows. Red (blue) circles
represent data for which the ‘a’ (‘b’) mode is initially
driven. In all four cases, rapid encircling around
the EP (τ → 0) results in vanishing energy transfer
(E → 0). For adiabatic encircling, the limiting
behaviour of E depends on the sense of the loop
and which mode is initially excited. For counter-clockwise
(clockwise) loops, the red (blue) data
correspond to conventional adiabaticity (E → 1 as τ
increases) and the blue (red) data show the opposite
behaviour (E → 0 as τ increases). As described
in the text, this reflects the non-reciprocity of

The behaviour described above may be summarized by describing
an adiabatic control loop around an EP as a matrix that transforms the
initial state C(0) = [c_a(0), c_b(0)]^T to the final state C(τ) = [c_a(τ), c_b(τ)]^T
with the form:

\[
U_{\circlearrowleft,\circlearrowright}(\tau) =
\begin{pmatrix}
a_{\circlearrowleft,\circlearrowright}(\tau) & b_{\circlearrowleft,\circlearrowright}(\tau) \\
c_{\circlearrowleft,\circlearrowright}(\tau) & d_{\circlearrowleft,\circlearrowright}(\tau)
\end{pmatrix}
\qquad (4)
\]

where ↺ and ↻ denote a counter-clockwise and clockwise loop,
respectively. Because H is a symmetric matrix, it is straightforward to
show that if U_↺(τ) and U_↻(τ) represent identical but time-reversed
control loops, then U_↺ = U_↻^T. Along with this relationship, the four
datasets in Fig. 4 demonstrate the non-reciprocity of these operations, that
is, that b_{↺,↻}(τ) ≠ c_{↺,↻}(τ) for τ > 1 ms (ref. 29). This inequality is also
evident in direct measurements of |b_{↺,↻}(τ)| and |c_{↺,↻}(τ)| (see Methods).
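
The statement that a symmetric H makes the propagators of time-reversed loops transposes of one another can be checked numerically. The sketch below is an illustration under stated assumptions: `toy_hamiltonian` is a hypothetical symmetric, non-Hermitian 2 × 2 matrix whose complex parameter x traces a circle around the degeneracy at x = ig, not the measured H of the experiment, and the propagator is built as a time-ordered product of short, piecewise-constant steps.

```python
import numpy as np

def expm2(M):
    """Exact exponential of a 2x2 complex matrix (remains valid at a degeneracy)."""
    half_trace = 0.5 * (M[0, 0] + M[1, 1])
    N = M - half_trace * np.eye(2)                 # traceless part, N @ N = s2 * I
    s2 = N[0, 0] ** 2 + N[0, 1] * N[1, 0]
    s = np.sqrt(complex(s2))
    sinh_over_s = 1.0 if abs(s) < 1e-12 else np.sinh(s) / s
    return np.exp(half_trace) * (np.cosh(s) * np.eye(2) + sinh_over_s * N)

def toy_hamiltonian(phase, g=0.5):
    """Hypothetical symmetric H; its eigenvalues coalesce at x = +/- i*g."""
    x = 0.5j + 0.5 * np.exp(1j * phase)            # circle of radius 0.5 around x = i*g
    return np.array([[x, g], [g, -x]], dtype=complex)

def propagator(tau, clockwise, steps=4000):
    """Time-ordered product for i dC/dt = H(t) C over one loop of duration tau."""
    dt = tau / steps
    U = np.eye(2, dtype=complex)
    for k in range(steps):
        frac = (k + 0.5) / steps                   # midpoint of step k
        phase = 2.0 * np.pi * (-frac if clockwise else frac)
        U = expm2(-1j * toy_hamiltonian(phase) * dt) @ U   # later steps act from the left
    return U

U_ccw = propagator(tau=5.0, clockwise=False)
U_cw = propagator(tau=5.0, clockwise=True)
rel_err = np.linalg.norm(U_ccw - U_cw.T) / np.linalg.norm(U_ccw)
print("relative deviation from U_ccw = U_cw^T:", rel_err)
print("|b|, |c| for the counter-clockwise loop:", abs(U_ccw[0, 1]), abs(U_ccw[1, 0]))
```

Each short-step factor is the exponential of a symmetric matrix and is therefore itself symmetric, so reversing the order of the factors transposes the product. The transpose relation does not by itself force the off-diagonal elements of a single propagator to be equal, which is what the comparison of |b| and |c| probes.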

We have demonstrated a new form of adiabatic topological operation 
that allows for non-reciprocal energy transfer between two eigenmodes 
of a mechanical system. This transfer exploits the presence of an EP 
in the spectrum of the two modes. The square membrane used here
also offers threefold and fourfold near-degeneracies, opening up the
possibility of studying dynamics in the vicinity of higher-order EPs.
Furthermore, the cryogenic optomechanical device used here is subject
to both thermal and quantum fluctuations; it is an open question
whether non-reciprocal topological effects will allow for new forms of 
control over these fluctuations. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS


Measurement set-up. A schematic illustration of the experiment is shown in
Extended Data Fig. 1. The optomechanical device and much of the measurement
set-up are described in ref. 30. The membrane and optical cavity are mounted in
a cryostat that is maintained at T = 4.2 K. The motion of the membrane is
monitored via a heterodyne measurement using a probe beam and a local oscillator,
both produced from a single laser (‘ML’ in Extended Data Fig. 1a). The probe-beam
frequency is shifted by an acousto-optic modulator (AOM1 in Extended Data
Fig. 1a) driven at 80 MHz. Pound-Drever-Hall locking is used to keep the probe
beam nearly resonant with one mode of the cavity; as a result its detuning Δ_p ≪ κ,
resulting in a negligible contribution to σ. Likewise, the large detuning of the local
oscillator (Δ_LO ≈ 80 MHz ≫ κ) also results in a negligible contribution to σ.
Control over the optomechanical system is provided by a separate laser (‘CL’ in
Extended Data Fig. 1a), whose detuning Δ and power P are controlled by an
additional acousto-optic modulator (AOM3 in Extended Data Fig. 1a). The
frequencies of the various beams are illustrated in Extended Data Fig. 1b. The
cavity is approximately single-sided and all measurements are performed in
reflection. The reflected beams are incident on a single photodiode, and
demodulation circuits are used to monitor multiple Fourier components of the
heterodyne signal, each with a bandwidth equal to 50 Hz.
Optically mediated mechanical coupling. We consider a system consisting of two 
mechanical modes, each coupled linearly to a common optical mode. We show 
that the optical field generates a tunable effective coupling between the mechanical 
modes, which can be exploited to produce an EP, as described in the main text. The 
model closely follows the one presented in ref. 31. 

In a standard optomechanical system, one considers an optical cavity mode with
a frequency that is linearly coupled to the position of a mechanical oscillator. An 
input-output approach to this system yields a pair of coupled differential equations 
for the two modes, which can be easily treated in the Fourier domain to understand 
the optical modification of the mechanical susceptibility. Here we consider a simple 
extension of this model in which there are two mechanical modes, each coupled 
to the same optical mode. This yields the following equations of motion for the 
mechanical/optical modes: 


\[
\begin{aligned}
\dot{a} &= -\left[\frac{\kappa}{2} + i\omega_c\right] a - i g_1 a z_1 - i g_2 a z_2 + \sqrt{\kappa_{\mathrm{in}}}\, a_{\mathrm{in}} \\
\dot{c}_1 &= -\left[\frac{\gamma_1}{2} + i\omega_1\right] c_1 - i g_1 a^{*} a + \sqrt{\gamma_1}\, \eta_1 \\
\dot{c}_2 &= -\left[\frac{\gamma_2}{2} + i\omega_2\right] c_2 - i g_2 a^{*} a + \sqrt{\gamma_2}\, \eta_2
\end{aligned}
\]

where a is the optical mode amplitude with resonant frequency ω_c, total dissipation
rate κ and input coupling rate κ_in. The ith mechanical mode is described by
position z_i = c_i + c_i*, where c_i is the complex mode amplitude and the asterisks
indicate complex conjugation. Each mechanical mode has resonant frequency ω_i,
dissipation rate γ_i, and is coupled to the optical mode with a single-photon coupling
rate g_i. The optical and mechanical modes are driven by input fields a_in and η_i,
respectively.

We now suppose that the cavity is driven by a beam with power P and frequency
Ω_L, detuned from the cavity resonance by Δ = Ω_L − ω_c. By doing so, we can express
the optical field as fluctuations d(t) around a mean intracavity field given by

\[
\bar{a} = \frac{\sqrt{\kappa_{\mathrm{in}}}\,\bar{a}_{\mathrm{in}}}{\kappa/2 - i\Delta},
\qquad
\bar{a}_{\mathrm{in}} = \sqrt{\frac{P}{\hbar\Omega_L}}
\]

Making these substitutions in the original system of equations yields the linearized 
equations of motion: 
where we have defined α_i = ā g_i. Moving to the Fourier domain, and defining the
cavity susceptibility χ_c(ω) = [κ/2 − i(ω + Δ)]⁻¹, we solve for d(ω) and d*(ω) and
substitute these into the equations for c_{1,2}(ω) to find a reduced system of two
equations describing the mechanical modes:
Note that we have dropped counter-rotating c_1* and c_2* terms. We have also dropped
the mechanical drive terms η_{1,2}. These are not necessary for our model, because
we drive the system to a particular initial state, turn off the drive and then focus
on the evolution of the system without any mechanical drive applied.

In the traditional optomechanical system, one defines the (single-mode)
optomechanical self-energy as Σ(ω) = i|α|²[χ_c*(−ω) − χ_c(ω)]. In this
two-mode system, we can extend this concept to a self-energy matrix:
where σ is defined in equation (3).
Writing our mechanical modes as a vector C(t) = [c_1(t), c_2(t)]^T, we can write the
following matrix equation:

\[
-i\omega\, C(\omega) =
-\begin{pmatrix}
\frac{\gamma_1}{2} + i\omega_1 & 0 \\
0 & \frac{\gamma_2}{2} + i\omega_2
\end{pmatrix} C(\omega)
- i\,\Sigma(\omega)\, C(\omega)
\]

Before we move back to the time domain, we note that Σ(ω) varies on the scale of
κ, whereas the mechanical modes are susceptible to drives only within their
linewidth, which is substantially smaller than κ, by assumption. Therefore, it is
sufficient to consider Σ(ω) ≈ Σ(ω_1) ≈ Σ(ω_2) ≡ Σ. (The mechanical modes are also
assumed to be nearly degenerate.) Now that Σ is not a function of ω, we can easily
move back to the time domain to obtain equation (1) (reprinted here for
convenience):

\[
i\,\dot{C}(t) = H\,C(t)
\]

where we define

\[
H =
\begin{pmatrix}
\omega_1 - i\frac{\gamma_1}{2} & 0 \\
0 & \omega_2 - i\frac{\gamma_2}{2}
\end{pmatrix}
+ \Sigma
\qquad (6)
\]

Here Σ is a complex quantity, which depends (via α_1 and α_2) on P and Δ. This is
the tunability that allows us to access an EP in the spectrum of the two mechanical
modes.

We note that equation (6) is identical to equation (2); the apparent difference is
due to the fact that in equation (2) the matrix Σ is expressed using the right-most
form in equation (5).
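
How a tunable complex Σ gives access to an EP can be illustrated with the closed-form eigenvalues of a 2 × 2 matrix. The sketch below is a toy example: the bare frequencies, damping rates and the entries of `sigma` are hypothetical values chosen so that the discriminant vanishes at a coupling of 0.5, and they are not derived from equations (3) or (5).

```python
import numpy as np

def mechanical_hamiltonian(omega, gamma, sigma_matrix):
    """H = diag(omega_i - i*gamma_i/2) + Sigma, the form of equation (6)."""
    bare = np.diag(np.asarray(omega, dtype=float) - 0.5j * np.asarray(gamma, dtype=float))
    return bare + np.asarray(sigma_matrix, dtype=complex)

def eigenvalues_2x2(H):
    """Closed-form eigenvalues; both coincide when the discriminant is zero (an EP)."""
    mean = 0.5 * (H[0, 0] + H[1, 1])
    disc = (0.5 * (H[0, 0] - H[1, 1])) ** 2 + H[0, 1] * H[1, 0]
    root = np.sqrt(complex(disc))
    return mean + root, mean - root

def sigma(coupling):
    """Hypothetical symmetric self-energy: extra damping on mode 1 plus a coupling."""
    return [[-1.0j, coupling], [coupling, 0.0]]

omega = (10.0, 10.0)    # degenerate bare frequencies, arbitrary units (assumption)
gamma = (0.25, 0.25)    # bare damping rates, arbitrary units (assumption)

for coupling in (0.4, 0.5, 0.6):
    lam_plus, lam_minus = eigenvalues_2x2(mechanical_hamiltonian(omega, gamma, sigma(coupling)))
    # Real part: eigenfrequency; -2 * imaginary part: damping rate.
    print(coupling, lam_plus, lam_minus)
```

Below the critical coupling the two eigenvalues share a frequency and differ in damping; above it they share a damping rate and differ in frequency; at the critical value they coalesce.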
Measuring the mechanical eigenvalue spectrum. In Figs 1 and 2, we show the 
presence of an EP in the complex eigenvalue spectrum (frequencies and decay rates) 
of the mechanical modes. At each point (P, A), the eigenvalues were measured by 
optically driving the mechanical modes and measuring their driven response. We 
measure the mechanical sidebands using the heterodyne measurement laser, locked 
to the cavity resonance. We set a certain P and Δ for the control laser, then apply
amplitude modulation at a frequency near ω_1 and ω_2, thus creating an optical beat
note that drives the mechanical modes. This modulation frequency is swept over
ω_1 and ω_2 and we use a lock-in amplifier to measure the complex response of the
heterodyne signal to this drive.

Two examples of these measurements are shown here. Extended Data Fig. 2 
shows a sweep over the two modes when the control-beam power is low and 
there is minimal hybridization of the two modes. In Extended Data Fig. 3, the 
control-beam power is large and detuned near −ω_{1,2} such that the modes hybridize
substantially, resulting in modes with nearly degenerate frequencies, but different
linewidths. The relative phase of the driven response of the two modes is such that
we see destructive interference in Extended Data Fig. 3. By fitting the complex
response to a sum of complex Lorentzians with an arbitrary phase offset, we extract
ω_1, ω_2, γ_1 and γ_2. The solid lines in Extended Data Figs 2 and 3 are these fits, from
which we extract the eigenvalues plotted in Figs 1 and 2.
EPs have also been observed in many other systems, including atom-cavity
composites, microwave cavities, optical systems, electronic circuits
and an exciton-polariton system, and are predicted to exist in Bose-Einstein
condensates, quantum dots, acoustic systems, magnetohydrodynamic
dynamos and nuclei.
Measurement of propagator matrix elements. Figure 4 shows the non-reciprocity
of the topological operations as parameterized by their energy transfer efficiency
E. The non-reciprocity of these operations can also be seen from direct
measurements of the magnitudes of the matrix elements defined in equation (4). These
measurements are carried out by, for example, initially driving the ‘a’ mode and
then performing a clockwise loop about the EP; in this case |a_↻(τ)| = |c_a(τ)/c_a(0)|
and |c_↻(τ)| = |c_b(τ)/c_a(0)|. Similarly, repeating this process, but with the ‘b’
mode initially driven, gives |b_↻(τ)| and |d_↻(τ)|. In Extended Data Fig. 4, we plot
the magnitudes of these propagator matrix elements as a function of the loop
duration τ. The points in Extended Data Fig. 4 are extracted from the same data
as shown in Fig. 4. For sufficiently large τ, we see that |b_{↺,↻}(τ)| ≠ |c_{↺,↻}(τ)| as
stated in the main text, which implies U_{↺,↻}(τ) ≠ U^T_{↺,↻}(τ).

The real-time dynamics studied here can be connected to the propagation of
light through an optical crystal with properties that vary along the beam path.
An encircling around an EP is also mapped onto the propagation through a
two-mode waveguide in a concurrent experiment.
Extended Data Figure 1 | Experimental schematics. a, Illustration of
the optical and electronic components. The measurement laser (‘ML’) is
split into a local oscillator (‘LO’ in b) and a probe beam (‘Probe’ in b). The
probe-beam frequency is shifted by an acousto-optic modulator (‘AOM1’),
and is locked to the cavity using a Pound-Drever-Hall (PDH) scheme
and modulation produced by an electro-optic modulator (‘EOM’). The
control laser (‘CL’; ‘Control’ in b) is locked to the measurement laser with
a frequency offset that is approximately double the free spectral range of
the cavity. The control parameters used to access the EP are the power
P and detuning Δ of the control laser. P and Δ are set by the amplitude
and frequency of a signal generator (‘SG’), which drives another
acousto-optic modulator (‘AOM3’). The PDH error signal is used to control the
frequency of yet another acousto-optic modulator (‘AOM2’), ensuring
that all beams track fluctuations of the cavity. Light is delivered to
(and collected from) the cryostat via an optical circulator. Coloured lines,
hollow lines and thick black lines show free-space laser beams, optical
fibres and electrical circuits, respectively. Triangles, ovals and semicircles
show electronics, fibre couplers and photodiodes, respectively. ‘DAQ’
indicates the data acquisition system. The silicon nitride membrane is
shown in purple. b, Illustration of the optical frequency domain. Lasers are
indicated by coloured arrows and cavity modes by black curves.
Extended Data Figure 2 | Lock-in signal at low laser power (Δ = −780 kHz, P = 73 μW). Left, amplitude (top, red) and phase angle (bottom, blue)
of the lock-in signal as a function of drive frequency. Right, the same data shown as a parametric plot of the in-phase and out-of-phase components
of the lock-in signal as a function of drive frequency.
Extended Data Figure 3 | Lock-in signal at high laser power (Δ = −780 kHz, P = 380 μW). Left, amplitude (top, red) and phase angle (bottom, blue)
of the lock-in signal as a function of drive frequency. Right, the same data shown as a parametric plot of the in-phase and out-of-phase components
of the lock-in signal as a function of drive frequency.
Extended Data Figure 4 | Magnitudes of propagator matrix elements.
Ablation-cooled material removal with ultrafast bursts of pulses

Can Kerse, Hamit Kalaycioglu, Parviz Elahi, Barbaros Cetin, Denizhan K. Kesim, Onder Akcaalan, Seydi Yavas,
Mehmet D. Asik, Bülent Oktem, Heinar Hoogland, Ronald Holzwarth & Fatih Omer Ilday


The use of femtosecond laser pulses allows precise and
thermal-damage-free removal of material (ablation) with wide-ranging
scientific, medical and industrial applications. However,
its potential is limited by the low speeds at which material can
be removed and the complexity of the associated laser
technology. The complexity of the laser design arises from the
need to overcome the high pulse energy threshold for efficient
ablation. However, the use of more powerful lasers to increase
the ablation rate results in unwanted effects such as shielding,
saturation and collateral damage from heat accumulation at higher
laser powers. Here we circumvent this limitation by exploiting
ablation cooling, in analogy to a technique routinely used in
aerospace engineering. We apply ultrafast successions (bursts)
of laser pulses to ablate the target material before the residual heat
deposited by previous pulses diffuses away from the processing
region. Proof-of-principle experiments on various substrates
demonstrate that extremely high repetition rates, which make
ablation cooling possible, reduce the laser pulse energies needed
for ablation and increase the efficiency of the removal process by
an order of magnitude over previously used laser parameters.
We also demonstrate the removal of brain tissue at two cubic
millimetres per minute and dentine at three cubic millimetres per
minute without any thermal damage to the bulk.

Ablation is the evaporative removal of a material when its temperature 
exceeds a critical value. Because the ablated material is physically carried 
away, the thermal energy contained in the ablated mass is also removed, 
thus reducing the average temperature of the remaining material. 
This effect forms the basis of ablation cooling, which has been
routinely used as an approach to thermal protection during the
atmospheric re-entry of rockets since the 1950s, owing to the minimal mass
requirements. Unlike ablation cooling for rockets, laser ablation is not
continuous, but takes place only during and shortly after an incident
laser pulse. For the laser parameters used in previous experiments,
ablation cooling has been negligible as a cooling mechanism in comparison
with heat conduction (diffusion) from the processing region into the 
bulk of the target, which is continuously occurring. For ablation cooling 
to become a major contributor, the time delay between the laser pulses 
(the inverse of the repetition rate) must be reduced until the part of the 
material that is to be ablated does not cool substantially between suc- 
cessive pulses. Only then would heat extraction due to ablation become 
comparable to that due to diffusion (Fig. 1a). 

The physics of the ablation-cooled regime can be explained
through a toy model (see Supplementary Information section 1 for
full details). We assume that each pulse gives rise to an instantaneous
temperature rise of ΔT, which is roughly proportional to the pulse
energy, E_p, and that the material cools with a 1/√(1 + t/τ_0) dependence
on the time delay, t, after the arrival of a pulse. The thermal relaxation
time, τ_0, is proportional to δ²/a, where δ is the depth or the lateral
radius (whichever dimension is smaller) of the section of the material
to be ablated and a is its thermal diffusivity. For a train of N pulses,
the temperature of the target surface that is encountered by the
(n + 1)th pulse is given by T_{n+1} = T_n + δT, where δT = ΔT/√(1 + τ_R/τ_0)
is the small net increase in target temperature by a single pulse
and τ_R is the inverse of the repetition rate. Ablation occurs when the
temperature exceeds a critical value T_c. For the traditional regime of
ultrafast ablation, the repetition rate is low (τ_R ≫ τ_0) and each
pulse must be energetic enough to cause ablation (ΔT > T_c − T_0,
where T_0 is the initial surface temperature). The ablation-cooled
regime corresponds to τ_R ≲ τ_0. In this regime, the energy of the
individual pulses can be lower than the ablation threshold because
temperature builds up from pulse to pulse and ablation starts after
the mth pulse in the train, where m = (T_c − T_0 − ΔT + δT)/δT.
The volume of the ablated material is given by V_ablated = β[N −
u(T_c − T_0 − ΔT)m]E_p u(N − m), where β is a proportionality factor and
u is the Heaviside (unit step) function. The thermal energy that diffuses
into the bulk of the target owing to cooling between the pulses is


\[
E_{\mathrm{heat}} = \alpha\,(T_c - T_0)\left(1 - \frac{1}{\sqrt{1 + \tau_R/\tau_0}}\right)(N - m)\,E_p
+ \alpha\,(\Delta T - \delta T)\, m\, E_p .
\]

For the traditional regime, this result reduces to
lim_{τ_R→∞} E_heat = α(T_c − T_0)N E_p.

The toy model makes two main predictions for the ablation-cooled
regime; both are confirmed by numerical solutions of the heat
diffusion equation (see Supplementary Information section 2 for details)
as well as the experiments described below. The first is that increasing
the repetition rate reduces the heating of surrounding regions
(Fig. 1b, c and Supplementary Fig. 1). Because less of the deposited
laser energy is lost to heat diffusion (lim_{τ_R→0} E_heat = 0), the ablation
efficiency is higher than for the traditional regime (Supplementary Fig. 3).
The second states that the pulse energy can be decreased if the
number of pulses is simultaneously increased in proportion, without a
subsequent reduction in the ablation efficiency (Fig. 1d). This is
necessary to fully benefit from the ablation-cooled regime, because
shielding effects (that is, ablation-induced plasma and ejected
particulates reflecting and scattering incoming light) will prevent
efficient ablation if the repetition rate is increased at a constant energy.
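
The pulse-to-pulse temperature build-up of the toy model can be sketched numerically. The example below is a minimal illustration under stated assumptions: it implements only the expressions for the net temperature rise per pulse, δT = ΔT/√(1 + τ_R/τ_0), the onset index m = (T_c − T_0 − ΔT + δT)/δT and the ablated volume β[N − u(T_c − T_0 − ΔT)m]E_p u(N − m); the function names, the convention u(0) = 0 and all numerical values are hypothetical.

```python
import math

def heaviside(x):
    """Unit step function; the value at x = 0 is taken to be 0 (an assumption)."""
    return 1.0 if x > 0 else 0.0

def toy_model(delta_T, T_c, T_0, n_pulses, pulse_energy, tau_R_over_tau_0, beta=1.0):
    """Return (net rise per pulse, onset index m, ablated volume) for the toy model."""
    # Temperature rise retained after cooling for one pulse period.
    dT_net = delta_T / math.sqrt(1.0 + tau_R_over_tau_0)
    # Pulse index after which ablation starts (kept real-valued, as in the formula).
    m = (T_c - T_0 - delta_T + dT_net) / dT_net
    # Pulses before the onset do not ablate if a single pulse is below threshold.
    ablating_pulses = n_pulses - heaviside(T_c - T_0 - delta_T) * m
    volume = beta * ablating_pulses * pulse_energy * heaviside(n_pulses - m)
    return dT_net, m, volume

# Sub-threshold pulses (delta_T < T_c - T_0) at two repetition rates; units are arbitrary.
fast = toy_model(delta_T=100.0, T_c=1300.0, T_0=300.0, n_pulses=50,
                 pulse_energy=1.0, tau_R_over_tau_0=0.0)
slow = toy_model(delta_T=100.0, T_c=1300.0, T_0=300.0, n_pulses=50,
                 pulse_energy=1.0, tau_R_over_tau_0=3.0)
# A single above-threshold pulse train at a low repetition rate (traditional regime).
traditional = toy_model(delta_T=1200.0, T_c=1300.0, T_0=300.0, n_pulses=50,
                        pulse_energy=1.0, tau_R_over_tau_0=99.0)
print(fast, slow, traditional)
```

In this sketch, shortening the pulse period relative to τ_0 increases the temperature retained per pulse, so fewer sub-threshold pulses are spent before ablation begins.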

To demonstrate ablation cooling, a customized femtosecond fibre
laser was used (see Supplementary Information section 3 for
details). We implemented burst-mode operation, because continuous
trains of energetic pulses at the high repetition rates required to access
the ablation-cooled regime correspond to a prohibitively high average
power and laser repositioning in continuous mode is limited. In burst
mode, the laser produces groups of high-repetition-rate pulses, which
are, in turn, repeated with a lower frequency. The duty cycle of the
pulsation can be adjusted to set the average power. Burst-mode material
processing has substantial benefits, but the possibility of ablation
cooling has not yet been recognized.

Figure 1 | Principles of ablation-cooled removal of a material by laser.
a, Schematic representation of the ablation process for low (traditional
regime, left diagrams) and high (ablation-cooled regime, right diagrams)
repetition rates. Temperature profiles are illustrated for t = τ_1 (i), which
is shortly after the arrival of the first pulse for both cases; for t = τ_2 (ii),
which is before (shortly after) the arrival of the second (last) pulse for the
low-repetition-rate (high-repetition-rate) laser; and for t = τ_3 (iii), which
is shortly after the arrival of the last pulse for the low-repetition-rate laser.
The colouration of the target material is based on simulation results shown
in b at the indicated time intervals of τ_1, τ_2 and τ_3. b, Calculated evolution
of the temperatures at the surface (solid lines) and below (at a depth of
30 times the optical penetration depth) the surface (dotted lines) for
repetition rates of 10 MHz (blue lines) and 1,600 MHz (black lines).
The pulse energies and number of pulses are the same for both cases.
The higher repetition rate results in substantially lower temperatures below
the surface due to ablation cooling. c, Expanded view of the shaded section
of the plot in b. d, Calculated evolution of the surface temperature (dashed
lines) and amount of ablated material (solid lines) for repetition rates of
100 MHz (green lines), 400 MHz (blue lines) and 1,600 MHz (red lines).
The ablation rate remains approximately the same when the product of the
pulse energy and repetition rate is maintained. The spikes in the surface
temperatures precisely indicate the arrival of pulses, which are not shown
explicitly for clarity. e, Experimental set-up for direct confirmation of
the ablation-cooling effect. f, The measured temperature increase that is
induced on thermoelectric module 1 (the target material; solid lines) and
thermoelectric module 2 (attached to the coverslip that collects a portion
of the ablated particles; dashed lines, values have been multiplied by three
to aid comparison with ΔT_target) with the laser operating in the
ablation-cooled regime (blue lines) and in the traditional regime (red lines).

First we present experimental evidence of the ablation-cooling
effect by simultaneously measuring the temperature of a target
material directly and the heat carried by the ablated particles (indirectly)
(Fig. 1e). The laser beam is focused onto and ablates the surface of a
thermoelectric module. This causes a temperature difference between
the laser-targeted top surface and the bottom surface, which generates
a voltage difference by the Seebeck effect. A portion of the particles
ejected from the surface during ablation stick to a glass coverslip, which
is held approximately 1 mm above the target. A second thermoelectric
module is used to monitor the temperature of the coverslip, which rises
in proportion to the thermal energy delivered by the ablated particles.
The measured temperatures of the target and the coverslip (Fig. 1f)
confirm that the target heats less, and the coverslip more, in the
ablation-cooled regime. For the ablation-cooled regime, the laser parameters
were 50 pulses of 3 μJ each, with an 800 fs duration, for a 0.2 MHz burst
and a 1.7 GHz intraburst repetition rate (this is within the ablation-cooled
regime assuming a typical thermal diffusivity of about 150 mm² s⁻¹ for the
ceramic surface of the thermoelectric module). To illustrate the traditional
regime, 3 μJ, 800 fs pulses at a 10 MHz uniform repetition rate were used
(10 MHz was chosen to be safely outside the ablation-cooled regime, although
the thermal diffusivity of the ceramic surface is not precisely known).

We demonstrate validity of the predictions of the toy model for abla- 
tion cooling across a range of materials (see Supplementary Information 
for a discussion of other materials). Copper and silicon were chosen as
examples of metal and semiconductor targets, respectively, because their
ablation rates with ultrafast pulses are well documented. The volume of 
material ablated as a function of the incident energy is shown in Fig. 2a 
for Cu and Fig. 2b for Si for various repetition rates. Figure 2c, d shows 
the number of atoms ablated per incident photon as a function of the 
pulse energy. We observe a substantial increase in ablation when the 
repetition rate is about 100 MHz or higher. Although it is not possible to 
predict the precise frequency required for each material (the toy model is 
too simple for us to expect quantitatively accurate predictions), τ_0 ≈ 1 ns
for Cu for a processing region depth of a few hundred nanometres. Given 
that increases in efficiency are predicted to begin at a tenth of the corre- 
sponding repetition rate, this prediction agrees with the experimental 
observations. The lower thermal diffusivity of Si compared with Cu is 
consistent with the increase in its ablation efficiency at 27 MHz, whereas 
the results at 1 MHz and 27 MHz are similar for Cu, implying that the 
onset of the ablation-cooled regime for Cu begins between 27 MHz and 
108 MHz. If the repetition rate is further increased, efficiency saturates 
at high pulse energies—a consequence of the expected shielding effect. 
The solution is to decrease the pulse energy, and increase the number of
pulses and the repetition rate (for example, from 25 pulses at 108 MHz
to 800 pulses (with 32 times lower energy) at 3,456 MHz). The amount
of ablation remains similar (black and pink data in Fig. 2a, b), which 
means that the shielding effects have been overcome. 
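
The scaling used in this example can be checked with a few lines of arithmetic: multiplying both the number of pulses and the repetition rate by 32 while dividing the pulse energy by 32 leaves the total burst energy and the burst duration unchanged. The sketch below is illustrative only; the function name and the use of arbitrary energy units are assumptions.

```python
def scale_burst(n_pulses, rep_rate_hz, pulse_energy, factor):
    """Raise pulse count and repetition rate, and lower pulse energy, by the same factor."""
    return n_pulses * factor, rep_rate_hz * factor, pulse_energy / factor

n0, f0, e0 = 25, 108e6, 1.0            # 25 pulses at 108 MHz; energy in arbitrary units
n1, f1, e1 = scale_burst(n0, f0, e0, 32)

burst_energy = (n0 * e0, n1 * e1)      # total energy per burst, before and after
burst_duration = (n0 / f0, n1 / f1)    # burst length in seconds, before and after
print(n1, f1 / 1e6, e1)
print(burst_energy, burst_duration)
```

Because the burst energy and duration are preserved, any change in the ablated volume between the two settings can be attributed to the lower energy of the individual pulses rather than to a change in the total dose.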

To place the ablation results into context, they should be compared 
with common literature values (see Supplementary Information section 
5 for an extensive discussion). Comparison with experiments on Cu 
using 70 fs pulses with a pulse energy of up to 0.4 mJ at 800 nm (ref. 17) 
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Figure 2 | Scaling down of the pulse energy with increasing repetition 
rate. a, b, Volumes (symbols) of Cu (a) and Si (b) ablated by a single burst 
of pulses as a function of total incident energy and fluence for different 
intraburst repetition rates. The predictions of the toy model for the lowest 
and highest repetition rates in the ablation-cooled regime are also shown 
(solid lines). c, d, Ablation efficiency in terms of number of atoms of Cu 
(c) and Si (d) ablated per incident photon as a function of pulse energy and 
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pulse fluence for different repetition rates. The legend applies to all panels. 
The lower and upper limits to the data correspond to the ablation threshold 
and available laser energy, respectively. In all panels the sample size for each 
data point is 20, where the centre values represent the mean and the error 
bars represent the standard deviation. Coloured symbols highlight the onset 
of the ablation-cooled regime and (beyond 108 MHz) the inverse scaling of 
the pulse energy with repetition rate in the ablation-cooled regime. 
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Figure 3 | Ablation of hard and soft tissues. a, b, Laser removal of a section 
of human dentine obtained in the traditional regime (a, 1 kHz uniform 
repetition rate) and in the ablation-cooled regime (b, 1.7 GHz intraburst 
repetition rate). Although both ablation cooling and traditional ultrafast 
processing avoid thermal damage at sufficiently low average powers, the 
ablation-cooled regime achieves approximately six times more ablation 
despite using pulse energies that are about 12 times lower. c, d, When the 
(uniform or intraburst, respectively) repetition rate, average power and 
scanning speed are simultaneously increased by a factor of 25, the traditional 
regime of ultrafast processing results in thermal damage (c; Supplementary 
Video 4), whereas the ablation-cooled regime completely avoids thermal 
effects and achieves an ablation speed of 3 mm³ min⁻¹, despite using a pulse 
energy that is 25 times lower (d; Supplementary Video 5). The insets in 
a-d show laser scanning microscope characterizations of the ablated holes. 
e, f, Histological images corresponding to about 1 mm³ sections, which were 
removed from a rat brain with the laser operating at an average power of 
600 mW in the traditional regime (e), showing presence of thermal damage, 
and in the ablation-cooled regime (f), showing no major thermal damage. 
g, Ablation-cooled laser removal of brain tissue at an average power of 2.7 W, 
achieving an ablation speed of 2 mm³ min⁻¹ and showing no major thermal 
damage. h, Bright-field optical image of a bovine cornea from which a flap 
was removed following ablation-cooled laser processing of a section 0.4 mm 
below the surface. Inset, optical coherence tomography image of the section 
indicated by the rectangle. 

reveals that we obtain around 2,000 times more ablation at the same 
fluence of approximately 0.04 J cm⁻², which is our maximum fluence 
for the intraburst repetition rate of 3,456 MHz. Even if we imagine 
the entire burst of 800 pulses to act like a single pulse and compare 
the results with those in ref. 17 for an equal total fluence (20 J cm⁻², 
the highest value for which direct comparison is possible), we obtain 
about 12 times more ablation, although our pulse fluence (energy) is 
smaller by a factor of 800 (2,400). Comparison with another reference'® 
indicates that the efficiency of ablation in our experiments is 100 times 
higher despite using a pulse energy that is 260 times lower, when 
matching the fluence of the entire burst to that of the single-pulse 
fluence. We achieve a level of ablation that is five times higher than 
results obtained with a burst-mode laser”? that does not exploit ablation 
cooling, despite using a pulse fluence that is 165 times smaller for the 
same burst fluence of 20 J cm⁻². These results conclusively demonstrate 
that the exploitation of ablation cooling increases the ablation efficiency 
by an order of magnitude while allowing the required pulse energy to 
be reduced by three orders of magnitude. 

We now focus on the reduction of undesired thermal effects in the 
ablation-cooled regime. We have performed systematic comparisons 
using high and low repetition rates of the same laser with identical 
focusing and scanning systems. Tissue removal may well be regarded 
as the ultimate test of the suppression of thermal effects because an 
increase in temperature of only a few degrees can lead to degradation. 
Hard-tissue experiments were conducted on human dentine to contrast 
the ablation-cooled regime with the traditional regime. At low average 
powers, the traditional regime (using 100 µJ pulses at 1 kHz, Fig. 3a) 
and the ablation-cooled regime (25 pulses of 4 µJ energy at a 1.7 GHz 
intraburst repetition rate and 1 kHz burst repetition rate, Fig. 3b) both 
provide results with negligible thermal damage (although the latter 
achieves an ablation rate four times higher). When increasing the 
processing speed by a factor of 25 with a corresponding increase in power, 
the traditional regime causes excessive carbonization (Fig. 3c), whereas 
the ablation-cooled regime does not, while achieving an ablation rate of 
3 mm³ min⁻¹ (Fig. 3d). Every other laser, focusing and scanning 
parameter was identical in these two experiments, showing that the thermal 
effects are greatly reduced as a result of ablation cooling. 
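
The power bookkeeping behind this comparison can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the pulse parameters quoted above; the function names are illustrative and are not taken from the authors' code.

```python
def average_power_w(pulse_energy_j, pulses_per_burst, burst_rate_hz):
    """Average power = energy delivered per burst x burst repetition rate."""
    return pulse_energy_j * pulses_per_burst * burst_rate_hz


def burst_duration_s(pulses_per_burst, intraburst_rate_hz):
    """Time between the first and the last pulse of one burst."""
    return (pulses_per_burst - 1) / intraburst_rate_hz


# Traditional regime: single 100 uJ pulses at 1 kHz (Fig. 3a).
p_traditional = average_power_w(100e-6, 1, 1e3)
# Ablation-cooled regime: bursts of 25 pulses of 4 uJ, repeated at 1 kHz (Fig. 3b).
p_burst = average_power_w(4e-6, 25, 1e3)

print(p_traditional, p_burst)          # both are about 0.1 W
print(burst_duration_s(25, 1.7e9))     # about 14 ns at 1.7 GHz
```

Both regimes therefore deliver the same average power and the same energy per millisecond; the difference lies only in how that energy is divided in time.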

There are numerous applications for soft-tissue ablation (refs 6, 10, 14), 
particularly in targeting the brain*’, where the extreme precision afforded 
by a laser is of paramount importance. For this reason, we compared 
the effectiveness of ablation cooling in selective tissue removal from 
freshly harvested whole rat brains. When the average power is low, heat 
diffusion from the processing region to the surrounding tissue is low 
enough that the traditional regime avoids thermal side effects, yielding 
damage-free ablation (ref. 11). For higher powers, the ablation-cooled regime 
demonstrates a clear advantage in the reduction of thermal effects: 
although low-repetition-rate ablation causes a broad heat-affected zone 
with damaged neighbouring cells, devascularization and prominent 
tissue loss (Fig. 3e), there is no major heat damage in the ablation-cooled 
regime at the same power (600 mW) and pulse energy (3 µJ) (Fig. 3f). 
The corresponding ablation rate of 0.75 mm³ min⁻¹ is eight times higher 
than when using 165 µJ, 180 fs pulses, with which a 0.55 mm³ section of 
brain tissue was removed in 360 s (ref. 11). With ablation cooling, at a 
much higher power of 2.7 W (432 MHz intraburst repetition rate, 27 kHz 
burst repetition rate, 16 µJ per pulse), virtually thermal-damage-free 
results are obtained (Fig. 3g) at an ablation rate of 2 mm³ min⁻¹. 
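
The factor of eight quoted above follows from the two removal rates. A short check, assuming the reference values cited from ref. 11 (0.55 mm³ removed in 360 s) and the 0.75 mm³ per minute reported here; the helper name is hypothetical.

```python
def removal_rate_mm3_per_min(volume_mm3, duration_s):
    """Convert a removed volume and a processing time into mm^3 per minute."""
    return volume_mm3 / (duration_s / 60.0)


# Reference experiment quoted from ref. 11: 0.55 mm^3 removed in 360 s.
reference_rate = removal_rate_mm3_per_min(0.55, 360.0)
# Ablation-cooled result at 600 mW reported in the text.
ablation_cooled_rate = 0.75

ratio = ablation_cooled_rate / reference_rate
print(round(reference_rate, 3), round(ratio, 1))
```

The ratio comes out slightly above eight, in line with the rounded figure in the text.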

Finally, we performed a flap-cutting procedure on a bovine cornea, 
as this is a realistic indicator for surgical applications®. An area several 
millimetres wide located about 0.4mm below the surface of the cornea 
was scanned with the laser and the top layer was then lifted off with 
a pair of tweezers (Fig. 3h); 24 pulses with 0.8 µJ of energy per burst 
were used, which is a reduction by a factor of approximately 15 in 
pulse fluence compared with previous results®. This result and similar 
experiments on poly(methyl methacrylate) (PMMA) and hydrogels 
demonstrate that ablation cooling retains several of its benefits even 
when used for subsurface processing (see Supplementary Information 
section 15 for a detailed discussion). 

We conclude by pointing out three speculative future directions of 
study: exploration of the far-from-equilibrium thermodynamics of the 
ablation-cooled regime, whether a suitably sculptured coherent pulse 
train can coherently enhance nonlinear processes”° and whether similar 
benefits are possible in proton therapy, because the laser-based 
generation of bursts of protons seems to be feasible””. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 

METHODS 


The majority of the experiments were performed with a customized Yb-doped 
fibre-laser, which is capable of operating in either a burst or uniform mode at a 
central wavelength of 1,035 nm. This laser and the other lasers used in the experi- 
ments are detailed in Supplementary Information section 3. In burst mode, the 
laser produces a sequence of an adjustable number of pulses (a burst) with a high 
intraburst repetition rate. The bursts are repeated at a much lower repetition rate 
(most commonly 1 kHz or 25 kHz). The intraburst repetition rate of this laser 
was designed to be switchable between 108 MHz, 216 MHz, 432 MHz, 864 MHz, 
1,728 MHz and 3,456 MHz. Lower repetition rates of 1 MHz and 27 MHz could 
be obtained by selectively picking pulses using the acousto-optic modulator that 
is used to create the bursts. In uniform mode, the laser produced evenly spaced 
pulses, typically at 1 kHz or 25 kHz. The pulse durations varied between 300 fs 
and about 1 ps depending on the pulse energy. Effort was made to keep the pulse 
durations as similar as possible when making direct comparisons between the 
burst or uniform modes. 

A discussion on the estimation of the minimum repetition rate for the 
ablation-cooled regime is available in Supplementary Information section 1d. The principal 
criterion is for the repetition rate of the laser to be faster than the rate at which thermal 
energy diffuses, or is convected in the case of fluids, into the surrounding regions. The 
dimensions of the interaction volume within which the deposited laser energy 
needs to be contained can be estimated as the size of the region to be ablated by 
the subsequent pulses, which is in the range of several hundred nanometres. For 
highly conductive materials, such as Si, Cu or the ceramic coating of the 
thermocouple in Fig. 1, we estimate τ₀ ≈ 1 ns. (Commonly found values for the thermal 
relaxation times in the scientific literature pertain to linear absorption, which is 
not valid for ablation by ultrafast pulses. During ultrafast ablation a plasma state 
is formed, which greatly changes the absorption properties.) The onset of ablation 
cooling is gradual (see Supplementary Figs 1 and 3) and even a repetition rate that 
corresponds to the inverse of 10τ₀ confers some of the benefits of this regime. 
Nevertheless, we have used repetition rates that exceed 1 GHz in most of the 
experiments that contrast the ablation-cooled regime with the traditional regime. 
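
As a rough illustration of this criterion, the sketch below compares the switchable intraburst repetition rates listed earlier with the rates implied by a relaxation time of about 1 ns. The factor of 10 for the gradual onset is taken from the paragraph above; the function and variable names are assumptions made for this example.

```python
def repetition_rate_hz(relaxation_time_s, onset_factor=1.0):
    """Repetition rate whose period equals onset_factor x relaxation time."""
    return 1.0 / (onset_factor * relaxation_time_s)


RELAXATION_TIME_S = 1e-9  # estimate quoted for Si, Cu and the ceramic coating
AVAILABLE_RATES_MHZ = [108, 216, 432, 864, 1728, 3456]

full_rate = repetition_rate_hz(RELAXATION_TIME_S)          # about 1 GHz
onset_rate = repetition_rate_hz(RELAXATION_TIME_S, 10.0)   # about 100 MHz

# Rates at or above the gradual onset, and rates faster than the relaxation itself.
in_onset = [f for f in AVAILABLE_RATES_MHZ if f * 1e6 >= onset_rate]
beyond_full = [f for f in AVAILABLE_RATES_MHZ if f * 1e6 >= full_rate]

print(in_onset)
print(beyond_full)
```

Under these assumptions every switchable rate from 108 MHz upwards lies at or above the onset, while only the two highest rates exceed 1 GHz, consistent with the statement that most contrasting experiments used rates above 1 GHz.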

The preferred method for positioning the laser beam on the sample was to 
use a computer-controlled galvanometric scanner, owing to its high speed. 
Alternatively, the sample can be repositioned using motorized stages. The scanning 
speed was adjusted such that a single burst was incident at a given spot. The laser 
spot size was approximately 24 µm for most of the experiments. To characterize 
the ablation efficiency, the scanning speed was adjusted so a single pulse (in the 
traditional regime) or a single burst (in the ablation-cooled regime) was incident 
at each ablation spot to eliminate the complicated effects of crater formation 
and shape on the amount of material ablated. For the experiments that aimed to 
demonstrate a micromachining procedure, such as drilling, cutting a section of 
Cu, Si or PbZrTiO3 (PZT) or the removal of a section of dentine, brain tissue or 
cornea, multiple scans were performed. This often required the readjustment of 
the focal plane after each layer of material had been ablated. In these experiments, 
the absolute durations for the completion of the process depend on the scanning 
and refocusing parameters. To minimize the influence of such factors, all of the 
parameters pertaining to the scanning procedure were kept constant when making 
comparisons between the traditional and ablation-cooled regimes. 

Experiments that aimed to compare and contrast the ablation-cooled and 
traditional regimes were performed on nine different target materials: Si (a 
semiconductor), Cu (a metal), a thermoelectric module, PZT (a ceramic, which loses 
its piezoelectricity when heated), PMMA (a transparent dielectric), dentine 
(a type of hard tissue) and hydrogel, brain and cornea targets (representative of soft 
tissues). In the case of (semi-)transparent materials, it is important to ensure that 
the pulse duration and peak intensities are sufficient to initiate nonlinear absorp- 
tion. It is also essential that ultrafast pulses (<10 ps) are used to avoid well-known 
mechanisms of thermal damage during a pulse. 

The processed samples were analysed using bright-field optical microscopy, 
laser scanning microscopy, scanning electron microscopy and (in several cases) 
in situ optical coherence tomography. Histological analyses were performed using 
haematoxylin and eosin staining and DAPI staining procedures (Supplementary 
Information section 12). Soft-tissue experiments were done in accordance with the 
ethical standards of the Bilkent University Ethics Committee, Approval Number 
2013/63. PZT, hydrogel and PMMA experiments are described in Supplementary 
Information sections 7, 14 and 16, respectively. Details on all of the relevant laser 
and scanning parameters and the target material properties used in each experi- 
ment are provided in the respective sections of Supplementary Information. 
Code availability. The code used in the simulations is available on request from 
the corresponding author. 


Sea-ice transport driving Southern Ocean salinity and its recent trends 


F. Alexander Haumann!?, Nicolas Gruber!’, Matthias Münnich!, Ivy Frenger!? & Stefan Kern* 


Recent salinity changes in the Southern Ocean!~’ are among the 
most prominent signals of climate change in the global ocean, 
yet their underlying causes have not been firmly established'***. 
Here we propose that trends in the northward transport of 
Antarctic sea ice are a major contributor to these changes. Using 
satellite observations supplemented by sea-ice reconstructions, we 
estimate that wind-driven®® northward freshwater transport by 
sea ice increased by 20 ± 10 per cent between 1982 and 2008. The 
strongest and most robust increase occurred in the Pacific sector, 
coinciding with the largest observed salinity changes*°. We estimate 
that the additional freshwater for the entire northern sea-ice edge 
entails a freshening rate of −0.02 ± 0.01 grams per kilogram per 
decade in the surface and intermediate waters of the open ocean, 
similar to the observed freshening'~>. The enhanced rejection of 
salt near the coast of Antarctica associated with stronger sea-ice 
export counteracts the freshening of both continental shelf”!!! and 
newly formed bottom waters’ due to increases in glacial meltwater!”. 
Although the data sources underlying our results have substantial 
uncertainties, regional analyses’? and independent data from an 


atmospheric reanalysis support our conclusions. Our finding that 
northward sea-ice freshwater transport is also a key determinant 
of the mean salinity distribution in the Southern Ocean further 
underpins the importance of the sea-ice-induced freshwater flux. 
Through its influence on the density structure of the ocean, this 
process has critical consequences for the global climate by affecting 
the exchange of heat, carbon and nutrients between the deep ocean 
and surface waters!*"!”. 

Figure 1 | Effect of northward sea-ice freshwater transport on Southern 
Ocean salinity. a, b, Schematic cross-sections illustrating the effect of 
northward sea-ice freshwater transport (blue arrows) on mean ocean 
salinity (a) and on the trends over the period 1982 through 2008 (b) 
(see Methods). The red line separates the open and coastal ocean regions. 
The increasing sea-ice transport freshened the open ocean and, by leaving 
the salt behind in the coastal region (red curved arrows), compensated for 
part of the freshening by enhanced glacial meltwater input (grey arrows). 
White arrows in b indicate the freshening effect from both sea ice and 
land ice. Positive fluxes are defined downwards or northwards. The 
orange arrows indicate ocean circulation. The background shows the 
mean salinity (colour scale) and density (dashed black lines) separating 
Circumpolar Deep Water (CDW) from Antarctic Intermediate Water 
(AAIW) and Subantarctic Mode Water (SAMW). AABW, Antarctic 
Bottom Water. 

Observations of salinity in the Southern Ocean over the past few 
decades have revealed a substantial widespread freshening in the 
surface waters of both coastal’®!* and open ocean regions””, as well as 
in the water masses formed from them!**°. In particular, the Antarctic 
Intermediate Water (AAIW) and Subantarctic Mode Water (SAMW) 
freshened at a rate between −0.01 g kg⁻¹ and −0.03 g kg⁻¹ per decade 
during the second half of the twentieth century’. In the Pacific 
and Indian Ocean sectors, continental shelf waters and the Antarctic 
Bottom Water (AABW) also freshened substantially*®!°, while in 
the Atlantic this freshening was smaller*'®. These salinity changes 
have been attributed to increased surface freshwater fluxes that stem 
either from enhanced Antarctic glacial melt or from increased 
atmospheric freshwater fluxes, as a result of an excess of precipitation 
over evaporation!». Glacial meltwater!” is the most likely cause of the 
freshened coastal waters in the Amundsen and Ross seas”!!!, but 
the freshening signal in the AABW, which is formed in this region, is 
much smaller than expected®. In contrast, the recent freshening of the 
AAIW seems to be much larger than can be explained by the simulated 
increases in the atmospheric freshwater flux by global climate models 
in the open Southern Ocean". 

Changes in northward sea-ice transport could possibly contribute to 
the widespread salinity changes in the Southern Ocean®. This process 
acts as a lateral conveyor of freshwater by extracting freshwater from the 
coastal regions around Antarctica where the sea ice forms and releasing 
it at the northern edge of the sea ice where the sea ice melts'??! (Fig. 1a). 
Despite substantial wind-driven changes in sea-ice drift over the past 
few decades*”, this contribution has not yet been quantified. Here we 
suggest that surface freshwater fluxes induced by stronger northward 
sea-ice transport are a major cause of the observed salinity changes in 
recent decades; this is corroborated by our finding that the transport 
process plays a key role in the long-term mean salinity distribution in 
the Southern Ocean. 

Our conclusions are based on basin-scale estimates of annual net 
sea-ice-ocean freshwater fluxes and the annual northward transport 
of freshwater by sea ice over the period 1982-2008. Further evidence 
is provided by our assessment of atmospheric reanalysis data” and the 
results from a regional study'*. We derived the sea-ice-related fresh- 
water fluxes by combining sea-ice concentration, drift and thickness 
data and by using a mass balance approach to determine the volume 
divergence and local change in sea ice (Methods). The sea-ice concen- 
tration is derived from satellite observations”? (Extended Data Fig. 1) 
and its thickness from a combination of satellite data** and a model- 
based sea-ice reconstruction that assimilates satellite data*> (Extended 
Data Fig. 2). The sea-ice volume divergence was computed from 
satellite-based sea-ice drift vectors”® (Extended Data Figs 3, 4) and sea- 
ice volume. From the resulting sea-ice volume budget, we estimated 
the freshwater equivalents of local annual sea-ice-ocean fluxes due to 
freezing and melting and annual lateral sea-ice transport (Methods). 
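
The mass-balance idea can be illustrated with a one-dimensional toy budget. This is only a sketch under stated assumptions, not the authors' processing chain: the ice density, the neglect of sea-ice salinity and snow, the finite-difference scheme and all names are assumptions made for illustration.

```python
RHO_ICE = 900.0   # kg m^-3, assumed bulk sea-ice density
RHO_FW = 1000.0   # kg m^-3, freshwater density


def ice_volume_per_area(concentration, thickness):
    """Ice volume per unit ocean area (m): concentration (0-1) x thickness (m)."""
    return [c * h for c, h in zip(concentration, thickness)]


def divergence_1d(flux, dx):
    """Centred finite-difference divergence; one-sided at the two ends."""
    n = len(flux)
    div = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        div.append((flux[hi] - flux[lo]) / ((hi - lo) * dx))
    return div


def freshwater_flux(vol_start, vol_end, drift, dx, dt):
    """Sea-ice-to-ocean freshwater flux, positive downwards (melting).

    Ice budget: dV/dt + div(u V) = freezing - melting, so the ocean gains
    freshwater wherever local change plus divergence is negative.
    """
    vol_mean = [(a + b) / 2.0 for a, b in zip(vol_start, vol_end)]
    transport = [u * v for u, v in zip(drift, vol_mean)]
    div = divergence_1d(transport, dx)
    return [-(RHO_ICE / RHO_FW) * ((b - a) / dt + d)
            for a, b, d in zip(vol_start, vol_end, div)]


# Steady ice cover thinning northwards while drifting north at constant speed.
volume = ice_volume_per_area([1.0, 1.0, 0.0], [1.0, 0.5, 0.0])
flux = freshwater_flux(volume, volume, [1.0, 1.0, 1.0], dx=1.0, dt=1.0)
print(flux)
```

In this example the ice volume does not change in time, so the converging transport must be balanced by melting, and the flux is positive (freshwater added to the ocean) at every point.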

Uncertainties in these derived freshwater flux products are substan- 
tial (Methods). A major challenge arises from the need to combine 
sea-ice drift estimates from different satellites to estimate the trends. 
We addressed potential inhomogeneities and biases by vigorous data 
quality control, implementing several corrections and considering 
different time periods (Methods). A second challenge is associated 
with the relatively limited number of observations of sea-ice thickness. 
These uncertainties plus the observationally constrained range of the 
other input quantities were incorporated into our error estimates of 
the final freshwater flux product (Extended Data Tables 1, 2). In the 
Atlantic sector, uncertainties associated with the mean sea-ice thickness 
distribution dominate the uncertainty, while in the Pacific sector uncer- 
tainties are mostly caused by uncertainties in sea-ice drift. 

Our analysis reveals large trends in the meridional sea-ice freshwater 
transport in the Southern Ocean between 1982 and 2008 (Figs 1b and 2c) 
that affect the regional sea-ice-ocean freshwater fluxes (Fig. 2d). 
The annual northward sea-ice freshwater transport of 130 ± 30 mSv 
(1 mSv = 1,000 m³ s⁻¹ = 31.6 Gt yr⁻¹; Fig. 2a; Extended Data Table 1) 
from the coastal region to the open ocean strengthened by +9 ± 5 mSv 
per decade (Extended Data Table 2). Here, the coastal ocean refers 
to the region between the Antarctic coast and the zero sea-ice–ocean 
freshwater flux line and the open ocean is the region between the 
zero sea-ice–ocean freshwater flux line and the sea-ice edge (Fig. 2b). 
The increased northward transport caused, on average, an additional 
extraction of freshwater from the coastal ocean of −40 ± 20 mm yr⁻¹ 
per decade and an increased addition to the open ocean region of 
+20 ± 10 mm yr⁻¹ per decade. 
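
The unit conversion and the size of the trend relative to the mean can be reproduced approximately as follows. The sketch assumes a freshwater density of 1,000 kg m⁻³, a 26-year period (1982 to 2008) and a comparison against the climatological mean; the authors' exact reference value for the percentage is not stated here, so the result is only expected to be of similar size.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # about 3.156e7 s
RHO_FW = 1000.0                          # kg m^-3, assumed freshwater density


def msv_to_gt_per_yr(msv):
    """Convert a freshwater transport in mSv (1 mSv = 1,000 m^3 s^-1) to Gt yr^-1."""
    m3_per_s = msv * 1000.0
    kg_per_yr = m3_per_s * RHO_FW * SECONDS_PER_YEAR
    return kg_per_yr / 1e12              # 1 Gt = 1e12 kg


def change_relative_to_mean_percent(trend_per_decade, mean, n_years):
    """Total linear change over n_years, as a percentage of the mean."""
    return 100.0 * trend_per_decade * (n_years / 10.0) / mean


one_msv = msv_to_gt_per_yr(1.0)
total_change = change_relative_to_mean_percent(9.0, 130.0, 26)

print(round(one_msv, 1))       # Gt per year carried by 1 mSv
print(round(total_change))     # per cent change over the period
```

With these assumptions the trend of 9 mSv per decade amounts to a change of a little under one-fifth of the 130 mSv mean, compatible with the 20 ± 10 per cent stated in the summary paragraph.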

Figure 2 | Mean state and trends of net annual freshwater fluxes 
associated with sea ice over the period 1982-2008. a, Mean sea-ice- 
induced freshwater transport. b, Mean net sea-ice-ocean freshwater flux. 
c, d, Linear trends of northward sea-ice freshwater transport (c) and net 
sea-ice-ocean freshwater flux from freezing and melting (d). Stippled 
areas are significant at the 90% confidence level using Student's t-test 
(see Methods). The arrows show the mean (a) and trend (c) of the annual 
transport vectors. The thick black lines indicate the zero sea-ice-ocean 
freshwater flux line that divides the coastal from the open ocean regions, 
the thin black lines show the continental shelf (1,000 m isobath). The grey 
lines represent the edge of the sea ice (1% sea-ice concentration) and the 
green lines show the boundaries of the ocean basins labelled. 

The overall intensification occurred primarily in the Pacific sector 
where we find a vigorous northward freshwater transport trend of 
+14 ± 5 mSv per decade. The trends in this sector are the most 
robust (Extended Data Table 3). Over the whole period, this change 
in the Pacific sector corresponds to an increase of about 30% with 
respect to the climatological mean in the entire Southern Ocean 
(Extended Data Table 1). The largest trends occurred locally in the 
high-latitude Ross Sea (Fig. 2c, d), where our estimated trends agree 
well with a previous study!? (Methods). The increase in the Pacific 
sector is partly compensated for by small decreases in the Atlantic and 
Indian ocean sectors. We reach similar conclusions when we consider 
only the satellite data from 1992 to 2004, that is, the period when 
they are least affected by potential inhomogeneities (Extended Data 
Table 3). 

The reason for the observed northward sea-ice freshwater transport 
and its recent trends is the strong southerly winds over the Ross and 
Weddell seas, which persistently blow cold air from Antarctica over 
the ocean, pushing sea ice northwards’. The winds over the Ross 
Sea considerably strengthened in recent decades, possibly owing 
to a combination of natural variability, changes in greenhouse gas 
concentrations and stratospheric ozone depletion’. These changes in 
the southerly winds induced regional changes in northward sea-ice 
drift®°, which are responsible for the sea-ice freshwater transport trends 
(Methods). This relation between the atmospheric circulation and 
sea-ice drift changes enabled us to independently estimate the sea-ice 
drift anomalies using sea-surface pressure gradients along latitude 
bands from atmospheric reanalysis data” (Methods). Comparing the 
resulting northward sea-ice transport anomalies to the satellite-based 
estimates across the same latitude bands results in a similar overall 
trend (Fig. 3). Thus, this alternative approach not only corroborates 
our estimated long-term trend, but also suggests that any remaining 
inhomogeneities in the sea-ice drift data that are due to changes in 
the satellite instruments are comparatively small after applying multiple 
corrections (Methods). 

Figure 3 | Time series of annual northward sea-ice freshwater transport 
anomalies across latitude bands. The underlying sea-ice drift data are 
based on two independent data sources: the corrected NSIDC satellite 
data (blue) and zonal sea-level pressure gradients from ERA-Interim 
data (grey; see Methods). The dashed lines show the respective linear 
regressions. The map (inset) shows the latitude bands in the Atlantic 
(69.5°S) and Pacific (71°S) sectors. 

To assess how the changing sea-ice-ocean freshwater flux (Fig. 2d) 
affected the salinity in the Southern Ocean we assumed that the 
additional freshwater in the open ocean region entered the AAIW 
and the SAMW formed from upwelling Circumpolar Deep Waters 
(CDW)??8 (Methods). We find that our freshwater flux trends imply 
a freshening at a rate of −0.02 ± 0.01 g kg⁻¹ per decade in the surface 
waters that are transported northwards and form the AAIW and 
SAMW (Fig. 1b). Thus, the sea-ice freshwater flux trend could account 
for a substantial fraction of the observed long-term freshening in these 
water masses!**. The strong sea-ice-ocean freshwater flux trends in 
the Pacific sector (Fig. 2d) spatially coincide with the region of largest 
observed surface freshening” (Extended Data Fig. 7) and can also 
explain the stronger freshening of the Pacific AAIW compared with that of 
the Atlantic!*. A more quantitative attribution of the observed salinity 
trends to the freshwater transport trends is beyond the scope of our 
study because the observed freshening trends stem from different time 
periods, and have strong regional variations and large uncertainties 
themselves!?+. However, our data show that changes in northward 
sea-ice freshwater transport induce salinity changes of comparable 
magnitude to the observed trends. 

Figure 4 | Mean annual sea-ice-related freshwater fluxes associated with 
melting, freezing and transport over the period 1982-2008. a, Sea-ice- 
ocean freshwater flux due to melting. b, Freshwater flux associated with 

Our estimates in coastal regions (Fig. 2d) also help to explain the 
observed salinity changes in the AABW*, which is sourced from this 
region. Additional glacial meltwater from West Antarctica’ strongly 
freshened the continental shelf in the Ross and Amundsen seas over 
recent decades”!!! (Fig. 1b). However, the observed freshening in 
Pacific and Indian Ocean AABW was found to be much smaller than 
expected from this additional glacial meltwater®. Our data suggest 
that the freshening induced by the increasing glacial meltwater is 
substantially reduced by a salinification from an increased sea-ice to 
ocean salt flux over the continental shelf in the Pacific sector. This salt 
flux trend corresponds to a freshwater equivalent of −10 ± 3 mSv per 
decade, resulting from increasing northward sea-ice export from this 
region of enhanced sea-ice formation (Fig. 2c, d). In contrast, over the 
continental shelf in the Atlantic sector our data suggest a decreasing 
sea-ice to ocean salt flux, corresponding to a freshwater equivalent of 
+6 ± 3 mSv per decade, which may have contributed to the observed 
freshening of the newly formed Atlantic AABW* and the north-western 
continental shelf waters!®. 

The large contribution of the trends in sea-ice freshwater transport 
to recent salinity changes in the Southern Ocean is in line with the 
dominant role that sea ice plays in the surface freshwater budget in the 
seasonal sea-ice zone”’ and in the global overturning circulation’? 7!” 
in the mean state. The freshwater equivalent of the total Southern 
Ocean sea-ice melting flux (Fig. 4a) is as large as 460 ± 100 mSv 
(Extended Data Table 1). On an annual basis, the vast majority of this 
melting flux is supplied by the freezing of seawater of −410 ± 110 mSv, 
with the remaining flux arising from snow-ice formation*° (Methods; 
Fig. 4b). Most of the sea ice is produced in the coastal region 
(−320 ± 70 mSv), but only about 60% of the sea ice also melts there. 
The rest, that is, 130 ± 30 mSv, is exported to the open ocean (Fig. 4c). 
These mean estimates agree well with an independent parallel study’, 
which is based on the assimilation of Southern Ocean salinity and 
temperature observations (Methods). 
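
The mean-state numbers in this paragraph fit together as a simple budget, which the sketch below re-adds using the central values only (uncertainties are ignored, and the variable names are illustrative).

```python
# Central estimates quoted above, in mSv; negative values remove freshwater
# from the ocean (freezing), positive values add it (melting).
melting_total = 460.0
freezing_total = -410.0
coastal_freezing = -320.0
fraction_melting_in_coastal_region = 0.60

# Melt water not supplied by frozen sea water is attributed to snow-ice formation.
snow_ice_contribution = melting_total + freezing_total

# Ice produced near the coast that does not melt there is exported northwards.
exported_to_open_ocean = -coastal_freezing * (1.0 - fraction_melting_in_coastal_region)

print(snow_ice_contribution)     # remainder of the melting flux
print(exported_to_open_ocean)    # compare with the 130 mSv transport
```

The export obtained this way is within a few mSv of the 130 ± 30 mSv northward transport, so the quoted figures are mutually consistent at the stated precision.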

The process of northward freshwater transport by sea ice effectively 
removes freshwater from waters that enter the lower oceanic 
overturning cell, in particular the AABW, and adds it to the upper 
circulation cell, especially the AAIW (Fig. 1a). Through this process, 
the salinity difference between these two water masses, and thus the 
meridional and vertical salinity gradients, increase. In a steady state, 
the northward sea-ice freshwater transport of 130 ± 30 mSv implies a 
salinity modification of +0.15 ± 0.06 g kg⁻¹ and −0.33 ± 0.09 g kg⁻¹ 
in waters that are entering the lower and upper cell, respectively 
(Methods). The latter suggests that sea-ice freshwater transport 
accounts for the majority of the salinity difference between the 
upwelling CDW and the exiting AAIW. We estimated that the 
salinification from sea ice in waters entering the lower circulation cell 
is compensated by glacial meltwater and excess precipitation over 
evaporation in this region in about equal parts, agreeing with the very small 
salinity difference between the CDW and AABW (Methods). 

Sea-ice freshwater export (fraction) 
relative to local freezing flux (red) and imported relative to the local 
melting flux (blue) due to sea-ice induced freshwater transport (arrows). 
Black and grey lines as in Fig. 2. 

Because salinity dominates the density structure in polar oceans”, 
our findings imply that sea-ice transport is a key factor for the vertical 
and meridional density gradients in the Southern Ocean and their 
recent changes (Fig. 1). This interpretation is consistent with the 
observation that large areas of the upper Southern Ocean not only 
freshened but also stratified in recent decades’. Increased stratification 
potentially hampers the mixing of deeper, warmer and carbon-rich 
waters into the surface layer and thus could increase the net uptake 
of CO₂'*!*!7. Consequently, our results suggest that Antarctic sea-ice 
freshwater transport, through its influence on ocean stratification and 
the carbon cycle, is more important for changes in global climate!*15 
than has been appreciated so far. This implication of our findings for 
the climate system stresses the need to better constrain spatial patterns 
as well as temporal variations in sea-ice—ocean fluxes by reducing the 
uncertainties in the observations of drift, thickness and snow cover of 
Antarctic sea ice. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 

METHODS 

Data. The satellite-derived sea-ice concentration is drawn from the Climate Data 
Record (CDR)’%, which comprises data from the NASA Team algorithm (NTA)?! 
and the Bootstrap algorithm (BA), as well as a merged data set. Sea-ice thickness 
data are taken from a reconstruction with the ocean–sea-ice model NEMO-LIM2 
(1980-2009), from the laser altimeter ICESat-1 (2003-2008; 
http://seaice.gsfc.nasa.gov)", as well as from ship-based observations 
(ASPeCt; 1980-2005; http://aspect.antarctica.gov.au)**. Satellite-derived 
sea-ice drift data originate from the National Snow and Ice Data Center 
(NSIDC), are provided in NetCDF format by the Integrated Climate Data 
Center (University of Hamburg) and are corrected by drifting buoy data 
(1989-2005)**. We used an alternative sea-ice drift product for the 
uncertainty estimation (1992-2003; http://rkwok.jpl.nasa.gov; hereafter 
referred to as Kwok et al.)**°°. Additionally, we used daily atmospheric sea-level 
pressure, surface air temperature and 10 m wind speed values from the ERA- 
Interim reanalysis (1980-2009, http://apps.ecmwf.int)”. We provide a detailed 
description of the data processing in the corresponding sections below. 

Sea-ice concentration. We used all three sea-ice concentration products available 
from the CDR™. If any of the grid points in either the merged, NTA or BA products 
show a sea-ice concentration of 0%, all of the products are set to 0%. We used 
a first-order conservative remapping method from the Climate Data Operators 
(CDO)*’ to interpolate the sea-ice concentration to the sea-ice drift grid. The BA 
performs better than the NTA around Antarctica as the NTA underestimates sea- 
ice concentrations by 10% or more”** (Extended Data Fig. 1a, b). Therefore, we 
primarily used the BA product. However, the BA potentially underestimates the 
concentration of sea ice in the presence of thin ice and leads”**. Therefore, we used 
the merged product, which should be more accurate in these regions”, to estimate 
the uncertainties. Generally, sea-ice concentration is the best constrained of the 
three sea-ice variables. Its contribution to the climatological mean flux uncertainty 
is below 1% (Extended Data Table 1). To obtain the uncertainty in the freshwater 
flux trends, we also used the NTA because differences in the trends in Antarctic 
sea-ice area between the BA and NTA have been reported”. Differences between 
the BA and NTA sea-ice concentration trends range from 10% to 20% relative to the 
actual trend (Extended Data Fig. 1c, d). The associated uncertainties in the spatially 
integrated sea-ice freshwater flux trends are about 10% (Extended Data Table 2). 
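
The zero-masking of the three concentration products can be illustrated with a minimal Python sketch. It assumes that each product is available as a flat list of concentrations (in %) on a common grid; the function name `mask_common_zeros` and the sample values are hypothetical and are not taken from the original processing chain.

```python
def mask_common_zeros(merged, nta, ba):
    """Set all three products to 0% wherever any one of them reports 0%."""
    out_merged, out_nta, out_ba = [], [], []
    for m, n, b in zip(merged, nta, ba):
        if m == 0 or n == 0 or b == 0:
            m, n, b = 0.0, 0.0, 0.0
        out_merged.append(m)
        out_nta.append(n)
        out_ba.append(b)
    return out_merged, out_nta, out_ba


# Hypothetical concentrations (%) at three grid points.
merged, nta, ba = mask_common_zeros(
    [10.0, 0.0, 50.0], [5.0, 20.0, 40.0], [15.0, 25.0, 0.0]
)
```

Only the first grid point keeps its values, because each of the other two has a zero in at least one product.
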
Sea-ice thickness. Sea-ice thickness data spanning our entire analysis period 
do not exist, mostly owing to challenges in remote sensing of Antarctic sea-ice 
thickness. We therefore used a sea-ice thickness reconstruction from a model 
that assimilated the observed sea-ice concentration. Through this assimilation, the 
model constrained air-sea heat fluxes, improving the spatial and temporal 
variability of the sea-ice thickness. The model did not assimilate sea-ice thickness 
observations themselves. Sea-ice thickness, as we use it here, is not weighted with 
sea-ice concentration and does not include the snow layer. 

The reconstruction overestimates the sea-ice thickness in the central Weddell 
and Ross seas and underestimates it in some coastal regions compared to the 
ICESat-1 and ASPeCt data sets (Extended Data Fig. 2). To compare the different 
sea-ice thickness data sets, we interpolated the reconstruction, ICESat-1 and 
ASPeCt data to the sea-ice drift grid using CDO distance-weighted averaging. 
For our best estimate of the sea-ice freshwater fluxes, we applied a weighted bias 
correction to the reconstruction using the spatially gridded version of the ICESat-1 
data (see the following paragraph). Both the ICESat-1 and ASPeCt data sets are 
potentially biased low, particularly in areas with thick or deformed sea ice, 
where we found the largest differences between these two data sets and the 
uncorrected reconstruction. Thus, the thicker sea ice in the Weddell Sea in the 
uncorrected reconstruction might be realistic, especially when considering 
alternative ICESat-1-derived estimates for this region. To capture the full 
uncertainty range associated with the mean sea-ice thickness distribution, we 
used the difference between the uncorrected reconstruction and the ICESat-1 
data. Uncertainties in sea-ice thickness dominate the climatological freshwater 
flux uncertainties in the Atlantic and Indian Ocean sectors, ranging from 10% 
to 35%, and are also substantial in all other regions and for the overall trends 
(Extended Data Tables 1, 2). 

For the correction of the mean sea-ice thickness distribution, we first calculated 
relative differences to ICESat-1 whenever data were available. Then, we averaged all 
of the differences that were within two standard deviations over time. We applied 
this average relative bias correction map to the data at each time step. To ensure 
that local extremes were not exaggerated, we used weights. Weights were one for a 
sea-ice thickness of 1.2 m, that is, the full bias correction was applied, and decreased 
to zero for sea-ice thicknesses of 0.2 m and 2.2 m, that is, no bias correction was 
applied. We derived these thresholds empirically to reduce biases with respect to 
the non-gridded ICESat-1 and ASPeCt data (Extended Data Fig. 2). Trends in 
the reconstruction remain largely unaffected by the bias correction (comparing 
Extended Data Fig. 2a and the original trend). 
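
The weighting of the bias correction can be sketched as follows. The text only fixes the weights at three thicknesses (one at 1.2 m, zero at 0.2 m and 2.2 m); the linear ramps between these points, the definition of the relative bias and both function names are assumptions made for this illustration.

```python
def bias_weight(thickness_m, low=0.2, full=1.2, high=2.2):
    """Weight of the bias correction: 1 at `full`, 0 at or beyond `low` and `high`.

    The linear ramp between the thresholds is an assumption of this sketch.
    """
    if thickness_m <= low or thickness_m >= high:
        return 0.0
    if thickness_m <= full:
        return (thickness_m - low) / (full - low)
    return (high - thickness_m) / (high - full)


def correct_thickness(thickness_m, relative_bias):
    """Apply the weighted correction to one thickness value.

    `relative_bias` is assumed to be (reconstruction - ICESat-1) / reconstruction,
    so that a weight of 1 removes the mean bias completely.
    """
    return thickness_m * (1.0 - bias_weight(thickness_m) * relative_bias)
```

With these assumptions, a 1.2 m value with a relative bias of 0.25 is reduced to 0.9 m, whereas a 2.5 m value is left unchanged.
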
Local extremes in the sea-ice thickness reconstruction, caused by ridging 
events, are probably inconsistent with the observed sea-ice drift and would lead 
to unrealistic short-term variations in our final fluxes. However, when considering 
the net annual melting and freezing fluxes and averages over large areas these 
variations cancel out. To reduce the noise in our data set, we filtered extremes 
with a daily sea-ice thickness anomaly larger than 2 m with respect to the climato- 
logical seasonal cycle, representing only 0.1% of all data points. These and other 
missing grid points (in total 2.6%) were interpolated by averaging the neighbouring 
grid points. We also calculated our sea-ice freshwater fluxes on the basis of the 
unfiltered data and included these fluxes in our uncertainty estimate. 

Snow-ice formation due to flooding and refreezing is part of the estimated 
sea-ice thickness. As snow-ice forms partly from the atmospheric freshwater flux 
and not from the ocean alone, it could lead to an overestimation of the total ocean 
to sea-ice freshwater flux due to freezing. The amount of snow-ice formation 
is highly uncertain but lies within the uncertainty of the sea-ice thickness. 
To account for this process, we reduced the freezing fluxes according to snow-ice 
formation estimates from the literature. In the Atlantic, Indian Ocean and Pacific 
sectors we applied approximate snow-ice formation rates of 8 ± 8%, 15 ± 15% and 
12 ± 12% of the freezing flux, respectively. In the entire Southern Ocean, the 
amount of snow that is transformed to ice would thus amount to about 50 mSv, or 
about 35% of the suggested atmospheric freshwater flux onto Antarctic sea ice. 

Trends in sea-ice thickness (Extended Data Fig. 2a) are highly uncertain but 
broadly agree among different modelling studies. To show that our results 
are robust with respect to the less certain trends or short-term variations in sea-ice 
thickness, we compared our estimated transport trends across the latitude bands 
(equation (3)) with a sensitivity analysis, where we kept the sea-ice thickness 
constant. The resulting transport trends across the latitude bands of about −6 mSv 
per decade in the Atlantic sector and about +11 mSv per decade in the Pacific 
sector are still within our estimated uncertainty (Extended Data Table 2). Most 
of the sea-ice thickness trends (Extended Data Fig. 2a) occur either north (in the 
Pacific sector) or south (in the Atlantic sector) of the zero freshwater flux line or 
latitude bands. Thus, the trend in sea-ice thickness does not considerably affect the 
northward sea-ice freshwater transport trend. However, the mean sea-ice thickness 
uncertainty at the zero freshwater flux line is the largest contributor to the uncertainty in the overall 
northward sea-ice freshwater transport trend (Extended Data Table 2). 
Sea-ice drift. We used the gridded version of the NSIDC sea-ice drift data set. 
In the Antarctic, it is based on five passive microwave sensors and data from 
the Advanced Very High Resolution Radiometer (AVHRR) (Extended Data 
Fig. 4). Two studies validated this data set with buoy data in the Weddell Sea 
(1989-2005) and around East Antarctica (1985-1997). There is a very high 
correlation between the buoy and the satellite data on large temporal and spatial 
scales (that is, monthly and regional) and a strongly reduced agreement on smaller 
scales (that is, daily and local). The satellite-derived sea-ice drift underestimates 
the sea-ice velocity given by the buoys by 34.5%, that is, faster drift velocities have 
a larger bias. The bias is smaller for the meridional (26.3%) than for the zonal 
drift. We corrected for these low biases by multiplying the drift velocity by the 
correction factor (1.357, that is, 1/(1 − 0.263)) that corresponds to the meridional drift bias. We argue 
that the meridional component of the bias is the better estimate in the central 
sea-ice region, which is the key region for our results. Here, the drift is mainly 
meridional. The larger biases are observed in the swift, mostly zonal drift along 
the sea-ice edge that causes the larger zonal biases. The spatial dependence of the 
bias and our correction imply that larger biases and uncertainties remain in our 
final product around the sea-ice edge. 
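
The relation between a relative low bias and the multiplicative correction factor can be checked with a short sketch. It assumes that the bias is defined as the fraction by which the satellite drift falls short of the buoy drift; the function name is hypothetical.

```python
def correction_factor(low_bias_fraction):
    """Factor that restores the buoy speed if v_satellite = (1 - bias) * v_buoy."""
    return 1.0 / (1.0 - low_bias_fraction)


# Meridional low bias of 26.3% reported for the buoy comparison.
meridional_factor = correction_factor(0.263)
```

Under this definition, the 26.3% meridional bias yields the factor of 1.357 quoted above.
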

We processed this bias-corrected drift data further: first we removed all of the 
data that were flagged as close to the coast or interpolated over large distances in 
the product; second, we removed any data with sea-ice concentrations below 50%, 
closer than 75 km to the coast, or with a spurious, exact value of zero. Our results 
are not sensitive to this filtering but it reduces the spatial and temporal noise. After 
these modifications, about 75% of all of the grid cells covered by sea ice had an 
associated drift vector. 

We compared both the original and the bias-corrected data to a partly 
independent product by Kwok et al. We interpolated these data onto our grid 
using CDO distance-weighted averaging and applied the same 21-d running 
mean as for the NSIDC sea-ice drift data. We compared sea-ice drift vectors 
whenever both data sets were available and sea-ice concentrations were larger 
than 50%. Extended Data Fig. 3 shows the meridional drift components before 
and after applying the bias correction factor from the buoy data (Extended Data 
Fig. 3a and b, respectively). We find that the agreement between the two data sets 
is much higher after the corrections. Compared with the original NSIDC sea-ice 
drift data set, the largest improvement occurs in the slope: 1.06 compared with 
1.55. Root-mean-square (r.m.s.) differences and the linear correlation coefficient 
remain identical and the absolute bias is reduced by 0.2 km d⁻¹. The correlation 
coefficients between the two data sets are 0.8 for both the zonal and meridional 
drift components. The spatial patterns of the mean annual sea-ice drift speed 
(Extended Data Fig. 3c-e) illustrate the improvement in agreement between 
the two data sets after the application of the bias correction but confirm that 
considerable differences remain at the sea-ice edge. These differences lead to a 
relatively high r.m.s. difference in the annual mean sea-ice drift speed in these 
regions (Extended Data Fig. 3f). However, in the central sea-ice pack—the region 
that is crucial for our results—the r.m.s. differences are much smaller. 

Our bias-corrected sea-ice drift speeds are typically slightly lower (by about 
9-19%) than those by Kwok et al. but considerably higher than in the uncorrected 
NSIDC data (about 26%, see above). We used these differences between the data 
sets to estimate the uncertainties induced by sea-ice drift on the sea-ice freshwater 
transport (Δu; Extended Data Tables 1, 2). First, we recomputed all of the fluxes by 
correcting the original NSIDC data with correction factors derived from the Kwok 
et al. data (1.82 or 45% for the zonal drift, and 1.55 or 35% for the meridional drift) 
instead of the buoy-derived correction factor. In this way, we also accounted for an 
uncertainty in the drift direction. Then we averaged the deviations between our 
best estimate and the estimate based on Kwok et al. with those between our best 
estimate and using the uncorrected and unfiltered NSIDC data. Uncertainties from 
sea-ice drift in the freshwater fluxes are about 20%. They contribute considerably 
to the final freshwater flux uncertainty and our trend uncertainties in all regions. 
Sea-ice-ocean freshwater flux. We estimated annual net sea-ice-ocean freshwater 
fluxes over the period 1982-2008 by calculating the local sea-ice volume change 
and divergence. From this we derived the local freshwater fluxes F (m³ s⁻¹) 
from the sea ice to the ocean due to freezing and melting on a daily basis through 
a mass balance: 


$$F = -C_{\mathrm{fw}}\left(\frac{\partial (A c h)}{\partial t} + \nabla \cdot (A c h \mathbf{u})\right) \qquad (1)$$

where the four variables c, h, u and A denote the sea-ice concentration, thickness, 
drift velocity and grid-cell area, respectively. The factor $C_{\mathrm{fw}}$ converts the sea-ice 
volume flux to a freshwater equivalent: 


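$$C_{\mathrm{fw}} = \frac{\rho_{\mathrm{ice}}\,(1 - s_{\mathrm{ice}}/s_{\mathrm{sw}})}{\rho_{\mathrm{fw}}} \qquad (2)$$

Here, $\rho_{\mathrm{ice}}$, $s_{\mathrm{ice}}$, $s_{\mathrm{sw}}$ and $\rho_{\mathrm{fw}}$ are the sea-ice density (925 kg m⁻³), the sea-ice salinity 
(6 g kg⁻¹), the reference seawater salinity (34.7 g kg⁻¹) and the freshwater 
density (1,000 kg m⁻³), respectively. 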
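
As a quick numerical check of equation (2), the conversion factor can be evaluated with the constants listed above. The function and argument names are chosen for this sketch only.

```python
def conversion_factor(rho_ice=925.0, s_ice=6.0, s_sw=34.7, rho_fw=1000.0):
    """Equation (2): converts a sea-ice volume flux to a freshwater equivalent."""
    return rho_ice * (1.0 - s_ice / s_sw) / rho_fw


c_fw = conversion_factor()  # dimensionless, roughly 0.77
```

With these values, one cubic metre of sea ice corresponds to roughly 0.77 m³ of freshwater.
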

The annual sea-ice freshwater fluxes were computed from the daily fluxes from 
March to February of the next year (that is, March 1982 to February 2009), which 
corresponds to the annual freezing and melting cycle of sea ice in the Southern 
Ocean. Remaining imbalances between, for example, the open and coastal ocean 
of the Atlantic sector (Extended Data Tables 1, 2) are due to multiyear sea ice in 
the coastal region. We performed all of the calculations on the grid of the sea-ice 
drift data and averaged all data products over 3 × 3 grid boxes, resulting in a 
nominal resolution of 75 km. To obtain the zero freshwater flux contour line, we 
averaged the climatological fluxes over 9 × 9 grid boxes. To estimate the melting 
and freezing fluxes, we separately summed up the positive and negative daily fluxes 
over a year (Fig. 4a, b). As temporal fluctuations accumulate when only adding 
positive or negative values, noise can lead to an overestimation of these fluxes. 
Each of the sea-ice variables (c, h and u) was therefore low-pass filtered using a 
21-d running mean. 
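
The effect of noise on separately summed melting and freezing fluxes can be illustrated with a small sketch. For brevity it filters a hypothetical flux series directly, whereas the analysis described above filters the input variables c, h and u before the fluxes are computed; the function names, the treatment of the series ends and the 3-point window in the example are assumptions.

```python
def running_mean(values, window=21):
    """Centred running mean; the window is shortened at both ends of the series."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def melt_and_freeze(daily_fluxes):
    """Sum positive (melting) and negative (freezing) daily fluxes separately."""
    melt = sum(f for f in daily_fluxes if f > 0)
    freeze = sum(f for f in daily_fluxes if f < 0)
    return melt, freeze


# Hypothetical pure-noise series: the net flux is zero, yet both partial sums are large.
noisy = [1.0, -1.0] * 10
melt_raw, freeze_raw = melt_and_freeze(noisy)
melt_smooth, freeze_smooth = melt_and_freeze(running_mean(noisy, window=3))
```

In this toy case the net flux is zero with or without smoothing, while the apparent melting flux drops from 10 to about 3 units once the series is filtered.
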
Sea-ice freshwater transport. The total northward sea-ice volume transport 
(in m³ s⁻¹) between the coastal and open ocean regions equals the spatial integral 
of the divergence term in equation (1) in either of the two regions (by Gauss’s 
theorem). We chose the open ocean region because there is considerable zonal 
exchange between the Indian Ocean and Atlantic sectors (Fig. 2a) in the coastal 
region, influencing the sector-based estimates. In the open ocean, this effect is 
negligible. We used this approach for the reported transport estimates (Extended 
Data Tables 1-3 and Extended Data Fig. 5a-c). 

To demonstrate that our main findings are robust on the basin scale, and not 
influenced by small-scale noise and local uncertainties, we also calculated the 
northward sea-ice freshwater transport across latitude bands at 69.5°S in the 
Atlantic sector and 71°S in the Pacific sector (Fig. 3). To this end, we averaged $c_n$, 
$h_n$ and meridional drift ($v_n$) in 1° longitude segments ($n$) along these latitudes and 
calculated the local freshwater transport $T_n$ (m³ s⁻¹): 

$$T_n = C_{\mathrm{fw}}\, c_n h_n v_n \Delta l_n \qquad (3)$$

where $\Delta l_n$ denotes the length of segments $n$ along the latitude bands. The 
combined annual northward freshwater transport of both sectors is 100 ± 30 mSv 
with an increase of 8 ± 5 mSv per decade over the period 1982-2008 (Extended 
Data Fig. 5d and Fig. 3). This compares well with the mean (120 ± 30 mSv) and 
trend (9 ± 5 mSv per decade) of our spatially integrated sea-ice-ocean fluxes in 
the Pacific and Atlantic (Extended Data Fig. 5b, c). 
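
Equation (3) can be sketched for a handful of segments as follows. The Earth radius, the spherical segment length, the rounded conversion factor of 0.765 (from equation (2)) and all sample values are assumptions of this illustration; concentrations are given as fractions between 0 and 1.

```python
import math

EARTH_RADIUS_M = 6.371e6  # mean Earth radius (assumed value)


def segment_length(lat_deg, dlon_deg=1.0):
    """Length (m) of a longitude segment of width dlon_deg along latitude lat_deg."""
    return EARTH_RADIUS_M * math.cos(math.radians(lat_deg)) * math.radians(dlon_deg)


def band_transport(c, h, v, dl, c_fw=0.765):
    """Sum of T_n = C_fw * c_n * h_n * v_n * dl_n over all segments (m^3 s^-1)."""
    return sum(c_fw * cn * hn * vn * ln for cn, hn, vn, ln in zip(c, h, v, dl))


# Hypothetical values for three 1-degree segments at 69.5 S.
dl = [segment_length(-69.5)] * 3
transport = band_transport(
    c=[0.9, 0.8, 0.95], h=[1.0, 1.2, 0.8], v=[0.05, 0.04, 0.06], dl=dl
)
transport_msv = transport / 1.0e3  # 1 mSv = 1,000 m^3 s^-1
```

Positive values of `v` denote northward drift, so a positive sum is a northward freshwater transport.
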

We calculated the spatial pattern of the sea-ice freshwater transport $\mathbf{f}$ (m² s⁻¹) 
as displayed in Fig. 2a, c, according to: 

$$\mathbf{f} = C_{\mathrm{fw}}\, c\, h\, \mathbf{u} \qquad (4)$$


Time-series homogenization. Our analysis and earlier studies revealed major 
temporal inhomogeneities in the NSIDC sea-ice drift data set at the transitions 
between satellite sensors (Extended Data Fig. 4). We argue that these temporal 
inhomogeneities are linked to the unavailability of the 85 GHz and 91 GHz 
channels and sparser data coverage in the earlier years. The drift speed before 
1982 seems to be underestimated, which is to some extent mitigated by AVHRR 
data thereafter. From 1982 to 1986, the drift speed is consistent but has a low 
bias. The drift ramps up in 1987, when the 85 GHz channels became available, 
and decreases again between 1989 and 1991, when these channels degraded. 
A final sudden decrease occurs from 2005 to 2006 when 85 GHz data were not used. 
We used wind speed data over the sea ice from ERA-Interim as an independent 
data source and scaled them to the sea-ice drift velocity for comparison (Extended 
Data Fig. 4b). The scaling factor stems from the consistent years in the period 
1988-2008 and varies in space and with the season. This analysis supports our 
argument that the sea-ice drift speed is underestimated when the higher resolution 
85/91 GHz channels were not available. We note that the meridional drift seems less 
sensitive to these inhomogeneities than the total drift, which might be related to 
higher data availability in the central sea-ice pack and is consistent with the lower 
biases found in the meridional sea-ice drift. 

Spurious increases in the sea-ice velocity would affect our estimated trends if 
they were not taken into account (Extended Data Figs 5, 6). Thus, we corrected 
the annual divergence (equation (1)) and lateral transport (equations (3), (4)) 
for the sensor-related temporal inconsistencies as follows. We excluded the 
inconsistent years (1980, 1981, 1987, 1989-1991, 2005 and 2006) from the analysis. 
To homogenize the years 1982-1986 with the years 1988-2008, that is, to remove 
the spurious trend in 1987, we first calculated linear regression lines before and after 
1987 at each grid point. Then we added the differences between the end (1986) and 
start (1988) points of the regression lines to all years before 1987, that is, assuming 
a zero change in 1987. Fitting regressions before and after spurious jumps is a 
common procedure to homogenize climate data. Here, we used a linear 
regression that serves the purpose of computing long-term trends in the time series. 
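
The offset correction across 1987 can be sketched for a single time series (one grid point). The least-squares helper, the function names and the sample data are hypothetical; the sketch only follows the steps stated above: fit one regression line before and one after 1987, take the difference between the fitted 1988 and 1986 values, and add it to all earlier years.

```python
def linear_fit(x, y):
    """Ordinary least-squares slope and intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    return slope, mean_y - slope * mean_x


def homogenize(years, values, jump_year=1987):
    """Shift all years before jump_year by the gap between the two regression lines."""
    early = [(t, v) for t, v in zip(years, values) if t < jump_year]
    late = [(t, v) for t, v in zip(years, values) if t > jump_year]
    slope_e, icpt_e = linear_fit(*zip(*early))
    slope_l, icpt_l = linear_fit(*zip(*late))
    end_early = slope_e * max(t for t, _ in early) + icpt_e   # fitted value in 1986
    start_late = slope_l * min(t for t, _ in late) + icpt_l   # fitted value in 1988
    offset = start_late - end_early
    adjusted = [v + offset if t < jump_year else v for t, v in zip(years, values)]
    return adjusted, offset


# Hypothetical series with an artificial jump of 5 units across 1987.
years = [1982, 1983, 1984, 1985, 1986, 1988, 1992, 1993, 1994, 1995]
values = [10.0] * 5 + [15.0] * 5
adjusted, offset = homogenize(years, values)
```

For this artificial series the estimated offset is 5 units, and the adjusted series is flat.
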

To estimate the sensitivity of the trends in northwards sea-ice freshwater 
transport to the uncertainties associated with the offset correction before 1987 
(shown in orange and green in Extended Data Fig. 5), we performed a Monte Carlo 
analysis by varying the offset and estimating the resulting trends. We generated 
10,000 normally distributed offsets around our best guess (about 19 ± 5 mSv 
for the entire Southern Ocean; Extended Data Table 3). The standard deviation 
of this distribution was chosen to match the offset uncertainty that arises from 
the r.m.s. errors of the trends in each of the two time intervals: 1982-1986 and 
1988-2008. For each of these generated offsets, we then estimated the trends and 
their significance (Extended Data Table 3). For both the entire Southern Ocean 
and the Pacific sector, all of the sampled offsets yield a positive northward sea-ice 
freshwater transport trend. All trends for the Pacific sector and 92% of those for 
the entire Southern Ocean are positive and at the same time significant at least at a 
90% confidence level using Student's t-test. Thus, our trend results are insensitive 
to uncertainties in the applied homogenization at the 90% confidence level. The 
posterior uncertainty shows that the uncertainty associated with the offset has no 
noticeable effect on the total uncertainty range, that is, it is smaller than ±1 mSv 
per decade. 
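
A strongly simplified version of this Monte Carlo test could look as follows. It samples normally distributed offsets, applies each to the years before 1987 and records how often the resulting trend is positive; the significance test is omitted, and the function names, the random seed and the sample series are assumptions of this sketch.

```python
import random


def slope(x, y):
    """Ordinary least-squares slope."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx


def positive_trend_fraction(years, raw, best_offset, offset_sd,
                            jump_year=1987, n_samples=10_000, seed=0):
    """Fraction of sampled offsets for which the homogenized trend is positive."""
    rng = random.Random(seed)
    n_positive = 0
    for _ in range(n_samples):
        offset = rng.gauss(best_offset, offset_sd)
        series = [v + offset if t < jump_year else v for t, v in zip(years, raw)]
        if slope(years, series) > 0:
            n_positive += 1
    return n_positive / n_samples


# Hypothetical raw series: a trend of 2 units per year, early years 5 units too low.
years = [1982, 1983, 1984, 1985, 1986, 1988, 1992, 1993, 1994, 1995]
raw = [2.0 * (t - 1982) - (5.0 if t < 1987 else 0.0) for t in years]
fraction = positive_trend_fraction(years, raw, best_offset=5.0, offset_sd=1.0)
```

In this example the underlying trend is large compared with the offset uncertainty, so every sampled offset yields a positive trend.
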

Uncertainty estimation. The uncertainties of the local (grid-point-based) fluxes 
and of fluxes on timescales shorter than one year are probably large owing to potential 
inconsistencies between the data sets on such scales and an amplification of the uncertainties 
by the spatial and temporal differentiations in equation (1). Integrating these terms 
in space and time greatly reduces these uncertainties (Extended Data Tables 1, 2). 
We estimated the uncertainties in our product that are associated with the 
underlying input variables c, h and u by using their observationally constrained 
ranges from different data sources, including the applied corrections and filtering 
as described. Additionally, we used an averaging period of 31 d (instead of 
21 d) and, for trends only, an estimate without a running-mean filter, to obtain 
uncertainty estimates associated with temporal noise (Δt). The results confirmed 
that the annual melting and freezing fluxes are sensitive to the low-pass filtering, whereas 
the net annual fluxes are not, because the noise averages out in the latter. The 
sensitivity of the spatially integrated values to variations of the zero freshwater flux line 
is estimated by varying the smoothing radius from two to six grid boxes (ΔA). The 
uncertainty associated with the constant conversion factor ($\Delta C_{\mathrm{fw}}$; equation (2)) 
is about 5% when using a realistic range of values. For the trends only 
we computed the standard error of the slope from the variance of the residuals 
around the regression line (As,)®. The total uncertainty for both the climatological 
mean and the trends was estimated by calculating the r.m.s. of the individual 
contributions. This analysis shows that in the Atlantic and Indian Ocean sectors 
both the uncertainties in the climatology and trends (Extended Data Tables 1, 2) 
are dominated by uncertainties in the sea-ice thickness. In contrast, the uncertainty 
in the sea-ice drift dominates the uncertainty in the Pacific sector. We tested the 
significance of the trends with Student's t-test, accounting for the fact that only 
21 out of 27 years were used and for a lag-1 autocorrelation. To indicate the 
significance of the trends at grid-point level (Fig. 2c, d and Extended Data Fig. 6), 
at which the data uncertainties are unknown, the local r.m.s. of the variance of the 
residuals was artificially increased by 40%, approximately corresponding to our 
data uncertainty estimate in Extended Data Table 2. The quality of our data directly 
at the coastline and around the sea-ice edge is reduced due to the limited quality 
and quantity of the underlying observations in these regions. 

Sea-ice freshwater flux evaluation. A modelling study carried out in parallel 
to this study calculated freshwater fluxes associated with sea-ice formation, 
melting and transport in the Southern Ocean State Estimate (SOSE). This model 
assimilates a large amount of observational data and optimizes the surface fluxes. 
The authors estimated an annual sea-ice-ocean freshwater flux due to sea-ice formation 
of −360 mSv over the entire Southern Ocean, which is within our estimated range 
of −410 ± 110 mSv. Moreover, they estimated that the combined annual 
sea-ice-ocean freshwater flux due to sea-ice and snow melting is about 500 mSv. Thus, in 
their estimate a total of 140 mSv of snow accumulated on the sea ice. Our estimates 
partly include snow accumulation on sea ice, because part of the sea-ice thickness 
results from snow-ice formation, which we estimated to be about −50 mSv (section 
on sea-ice thickness). However, the snow layer on top of the sea ice is not included 
in our estimate of the freshwater flux due to sea-ice melting of 460 ± 100 mSv. In 
that study, the authors estimate that the lateral sea-ice freshwater transport from 
the density class of the CDW to the AAIW and the SAMW amounts to 200 mSv 
in the period between 2005 and 2010. Their estimate slightly differs from our 
estimated transport from the coastal to the open ocean, which ranges between 
about 140 mSv and 160 mSv in 2007 and 2008 (Extended Data Fig. 5). The reasons 
might be the slightly different regions and that their estimate also includes the 
transport of the snow layer on top of the sea ice. 

Given the reduced confidence in the local fluxes (for example, sea-ice 
production in coastal polynyas), it is reassuring that our data agree within our 
estimated range of uncertainty with previous estimates of mean fluxes for some 
larger coastal polynya regions. Our confidence is higher for fluxes integrated 
over larger regions, such as the high-latitude Ross and Weddell seas (Extended 
Data Fig. 5e). Here our estimates are in close agreement with previous studies. 

In the Ross Sea, we estimated that the northward transport from the coastal 
region across a flux gate between Land Bay and Cape Adare (the turquoise 
area in Extended Data Fig. 5e) is 23 ± 5 mSv, increasing by about 30% 
(or +7 ± 4 mSv) per decade in the period 1992-2008. On the basis of the same 
passive microwave data, but using a different algorithm for retrieving the sea-ice 
motion data, two studies found a mean sea-ice area flux across this flux gate 
of about 1,000,000 km² between March and November in the periods 1992-2003 
(ref. 36) and 1992-2008 (ref. 66), respectively. Using an approximated mean sea-ice 
thickness (0.6 m) and the conversion factor (equation (2)), this corresponds 
to a mean northward freshwater transport of about 19 mSv. In close agreement 
with our estimate, these studies found an increase of 30% per decade (about 
+6 mSv per decade). Another study, using sea-ice motion from the Advanced 
Microwave Scanning Radiometer-EOS (AMSR-E), estimated that the mean 
sea-ice area flux between April and October (2003-2008) across the same flux gate 
is about 9.3 × 10⁵ km², corresponding to a freshwater transport of about 23 mSv. 
Using the same data, but an alternative approach, they found that the total sea-ice 
production in all of the Ross Sea polynyas together was about 737 km³ between 
April and October (2003-2008), corresponding to a sea-ice-ocean freshwater flux 
of −31 mSv. This estimate is similar to the total production of about −36 ± 7 mSv 
south of the flux gate in our data set, because most of the sea-ice production in this 
region occurs in the polynyas. Using passive microwave data, the same study 
found an increase of the production in the Ross Sea polynyas of 28% per decade 
between 1992 and 2008. A modelling study found a net annual sea-ice-ocean 
freshwater flux due to melting and freezing of −27 mSv on the continental shelf 
in the Ross Sea, which is in agreement with our estimate of −23 ± 5 mSv. They 
also found a long-term (unquantified, see figure 9b in ref. 68) decrease in the net 
annual sea-ice-ocean freshwater flux over the Ross Sea continental shelf in the 
period 1963-2000, which is qualitatively in line with our results. 

In the Weddell Sea, the northward sea-ice area flux across a flux gate close 
to the 1,000 m isobath (blue area in Extended Data Fig. 5e) has been found to 
be 5.2 × 10⁵ km² on the basis of AMSR-E data between April and October 
(2003-2008). Using an approximated mean sea-ice thickness (0.75 m) and the 
conversion factor (equation (2)), this corresponds to a mean northward freshwater 
transport of about 16 mSv. This agrees well with our estimate of an annual northward 
transport of 16 ± 4 mSv for the same years and the same region. Similar to the 
Ross Sea, production in the major polynyas of the Weddell Sea has been estimated previously. 
However, in the Weddell Sea, a large fraction of the sea ice transported across the 
flux gate is not produced in the coastal polynyas; thus we cannot directly compare 
our large-scale estimate to the sea-ice production in the polynyas. The same 
study, based on passive microwave data, found a small, but insignificant 
long-term decrease in the sea-ice production in the Weddell Sea polynyas between 
1992 and 2008, which is qualitatively consistent with our findings in the Atlantic 
sector. For a much larger area in the Weddell Sea, a modelling study estimated 
an annual northward sea-ice freshwater transport of about 34 mSv and another 
observational study, mostly based on moorings and wind speed, estimated that 
this flux is as large as about 38 ± 15 mSv. These estimates agree well with our 
finding of an annual northward freshwater transport of 41 ± 18 mSv across the 
69.5°S latitude band, which is approximately their considered transect. 
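
The conversions from published sea-ice area fluxes to freshwater transports quoted for the Ross and Weddell seas can be approximately reproduced with a short sketch. It assumes that the seasonal area flux is divided by the length of the respective season (275 days for March to November, 214 days for April to October) and that a thickness of 0.6 m also applies to the AMSR-E based Ross Sea estimate; the function name is hypothetical.

```python
SECONDS_PER_DAY = 86_400.0
C_FW = 925.0 * (1.0 - 6.0 / 34.7) / 1000.0  # conversion factor from equation (2)


def freshwater_transport_msv(area_flux_km2, thickness_m, n_days):
    """Mean freshwater transport (mSv) implied by an ice-area flux over n_days."""
    ice_volume_m3 = area_flux_km2 * 1.0e6 * thickness_m
    transport_m3_s = C_FW * ice_volume_m3 / (n_days * SECONDS_PER_DAY)
    return transport_m3_s / 1.0e3  # 1 mSv = 1,000 m^3 s^-1


ross_mar_nov = freshwater_transport_msv(1.0e6, 0.6, 275)      # March-November
ross_apr_oct = freshwater_transport_msv(9.3e5, 0.6, 214)      # April-October
weddell_apr_oct = freshwater_transport_msv(5.2e5, 0.75, 214)  # April-October
```

Under these assumptions the three results lie within about 1 mSv of the quoted 19 mSv, 23 mSv and 16 mSv.
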
Sea-ice freshwater transport based on ERA-Interim data. To support our 
findings, we quantified the changes in sea-ice motion that are induced by changes 
in geostrophic winds from daily ERA-Interim sea-level pressure and 
surface air temperature data. We averaged the data over 1° longitudinal segments along 
the previously defined latitude bands (Fig. 3), computed 21-d running means, 
and smoothed the data spatially over seven longitudinal bins. Then we calculated 
the sea-level pressure gradients along the latitude bands and used these together 
with the atmospheric surface density to estimate geostrophic winds normal to 
the latitude bands. From these, we calculated the sea-ice drift speed using 
a drift-to-wind-speed ratio of 0.016, derived from drifting buoys in the central 
Weddell Sea. This parameter is strongly variable in space and time, which is a 
major uncertainty in the resulting sea-ice drift. Nevertheless, it provides an average 
estimate for the mostly free-drifting sea ice in the central Antarctic sea-ice pack. 
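
The chain from pressure gradient to sea-ice drift can be sketched for a single point on a latitude band. The ideal-gas air density, the physical constants, the function name and the sample values are assumptions of this illustration; only the drift-to-wind-speed ratio of 0.016 is taken from the text.

```python
import math

OMEGA = 7.2921e-5    # Earth's rotation rate (rad s^-1)
R_DRY_AIR = 287.05   # specific gas constant of dry air (J kg^-1 K^-1)


def northward_ice_drift(dp_dx, pressure_pa, temperature_k, lat_deg, drift_ratio=0.016):
    """Northward sea-ice drift (m s^-1) from the eastward sea-level pressure gradient.

    dp_dx is the pressure gradient along the latitude band (Pa m^-1, positive eastward).
    Air density follows from the ideal-gas law (an assumption of this sketch).
    """
    rho_air = pressure_pa / (R_DRY_AIR * temperature_k)
    coriolis = 2.0 * OMEGA * math.sin(math.radians(lat_deg))
    v_geostrophic = dp_dx / (rho_air * coriolis)  # geostrophic balance: f v = (1/rho) dp/dx
    return drift_ratio * v_geostrophic


# Hypothetical case at 69.5 S: pressure falls by 1 hPa per 100 km towards the east.
drift = northward_ice_drift(dp_dx=-1.0e-3, pressure_pa=98_500.0,
                            temperature_k=258.15, lat_deg=-69.5)
```

Because the Coriolis parameter is negative in the Southern Hemisphere, pressure falling towards the east gives a northward geostrophic wind and hence a northward drift in this example.
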
The resulting northward sea-ice freshwater transport (equation (3)) is inde- 
pendent in terms of the sea-ice drift but not in terms of the sea-ice concentration 
and thickness. We used anomalies (at each 1° increment) because the absolute 
values of the local transport are likely to be biased by the local influences of ocean 
currents and sea-ice properties. The resulting total annual anomalies of the north- 
ward sea-ice freshwater transport agree well in terms of the variability and long- 
term trend with the transport anomalies based on the satellite sea-ice drift data 
(+8 mSv per decade; Fig. 3). These estimates do not suffer from the temporal 
inhomogeneities that we identified in the satellite sea-ice drift data (see Methods 
section 'Time-series homogenization'). 
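Sea-ice contribution to ocean salinity. We determined the evolution of ocean 
salinity s (g kg⁻¹) in response to a given value of F (m³ s⁻¹) from a combination 
of mass and salt balances. The mass balance for a given well-mixed ocean surface 
box of volume V and density ρ reads: 

$$\frac{\mathrm{d}(\rho V)}{\mathrm{d}t} = \rho_{\mathrm{in}} Q_{\mathrm{in}} + \rho_{\mathrm{fw}} F - \rho Q_{\mathrm{out}} \qquad (5)$$

where $Q_{\mathrm{in}}$ and $Q_{\mathrm{out}}$ (m³ s⁻¹) are the volume fluxes of seawater in and out of the 
box, and $\rho_{\mathrm{in}}$ (kg m⁻³) is the density of the inflowing seawater. In a steady state, equation (5) yields: 

$$\rho_{\mathrm{in}} Q_{\mathrm{in}} = \rho Q_{\mathrm{out}} - \rho_{\mathrm{fw}} F \qquad (6)$$

The corresponding salt balance reads: 

$$\rho V \frac{\mathrm{d}s}{\mathrm{d}t} = \rho_{\mathrm{in}} Q_{\mathrm{in}} s_{\mathrm{in}} - \rho Q_{\mathrm{out}} s \qquad (7)$$

We assumed the same constant source water salinity $s_{\mathrm{in}} = s_{\mathrm{sw}}$ and $\rho_{\mathrm{fw}}$ as in equation 
(2) and used a constant reference density (ρ = 1,027 kg m⁻³). Moreover, we used 
the formation rate of the modified water mass as the volume flux of seawater out of 
the surface box ($Q_{\mathrm{out}} = Q$). Then, substituting equation (6) into equation (7) yields: 

$$\rho V \frac{\mathrm{d}s}{\mathrm{d}t} = (\rho Q - \rho_{\mathrm{fw}} F)\, s_{\mathrm{sw}} - \rho Q s \qquad (8)$$

In a steady state, this results in an equation that describes the modified salinity s 
as follows: 

$$\rho Q s = (\rho Q - \rho_{\mathrm{fw}} F)\, s_{\mathrm{sw}} \qquad (9)$$

Using $s = s_{\mathrm{sw}} + \Delta s$, where $\Delta s$ is the difference in salinity between the source and 
modified water masses, equation (9) reduces to: 

$$\Delta s = -\frac{\rho_{\mathrm{fw}}\, s_{\mathrm{sw}}\, F}{\rho Q} \qquad (10)$$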
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We used net water-mass formation rates (Q) of 29 Sv for formation of the AABW 
from the CDW and 13 Sv for the formation of the AAIW/SAMW from the CDW”’. 
Figure 1a illustrates the results and shows the zonal mean ocean salinity and density 
distribution” for comparison. 

Assuming that +130 + 30 mSv of freshwater enter the CDW through north- 
ward sea-ice freshwater transport, the salinity modification between the CDW 
and AAIW/SAMW (using equation (10)) is —0.33 + 0.09 g kg !. The uncertainty 
includes a +2 Sv uncertainty in the water-mass formation rate. In observations, 
the salinity difference between the CDW and the AAIW and SAMW ranges from 
about —0.3g kg! to —0.5g kg! (ref. 28). Thus, northward freshwater transport 
by sea-ice could explain the majority of the salinity modification, consistent with 
very recent findings”’ and a mixed-layer salinity budget”’. 

Similarly, we calculated the contribution of −130 ± 30 mSv of freshwater
removed from coastal regions due to northward sea-ice transport to the salinity
modification (using equation (10)) between the CDW and AABW, obtaining an
increase of +0.15 ± 0.06 g kg⁻¹. The uncertainty includes a ±7 Sv uncertainty in the
AABW formation. However, the observed salinity differences between the CDW
and AABW are generally small or even of opposite sign. This is the result of a
compensating effect between a sea-ice-driven salinification and a freshening from
glacial and atmospheric freshwater. The freshwater fluxes from land ice through
basal and iceberg melting are about +46 ± 6 mSv and +42 ± 5 mSv, respectively.
Assuming that roughly 60% of the icebergs melt in the coastal regions, a total of
about +70 mSv is added from the land ice to the coastal ocean, corresponding
to a freshening of about −0.08 g kg⁻¹ or a compensation of the sea-ice freshwater
flux of about 55% in the AABW. We estimated from the ERA-Interim atmospheric
reanalysis data that the net atmospheric freshwater flux in the coastal region is
about +80 mSv, corresponding to a freshening of about −0.09 g kg⁻¹. The resulting
net salinity change in coastal waters from sea-ice, atmospheric and land-ice
freshwater fluxes is almost zero (−0.02 g kg⁻¹). Such a compensation of the freshwater
fluxes in coastal regions was noticed previously. We note that large regional
variations of these fluxes have been reported.
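
The arithmetic behind these salinity modifications can be sketched in a few lines of Python. This is an illustration of equation (10), not the authors' script. The freshwater density (1,000 kg m⁻³) and the source-water salinity (34.7 g kg⁻¹) are assumptions made here, because the text takes both from equation (2) without restating them, so the results should only be expected to land close to the quoted values.

```python
# Sketch of equation (10): delta_s = -(rho_fw * s_sw * F) / (rho * Q).
RHO = 1027.0     # reference seawater density (kg m^-3), as stated in the text
RHO_FW = 1000.0  # ASSUMED freshwater density (kg m^-3); defined with equation (2)
S_SW = 34.7      # ASSUMED source-water (CDW) salinity (g kg^-1); defined with equation (2)


def salinity_change(freshwater_flux_msv, formation_rate_sv):
    """Salinity change (g kg^-1) for a freshwater flux F (mSv) and formation rate Q (Sv)."""
    flux_sv = freshwater_flux_msv * 1e-3  # mSv -> Sv, so the units of F/Q cancel
    return -RHO_FW * S_SW * flux_sv / (RHO * formation_rate_sv)


cases = {
    "AAIW/SAMW, sea ice (+130 mSv, Q = 13 Sv)": salinity_change(130, 13),
    "AABW, sea ice (-130 mSv, Q = 29 Sv)": salinity_change(-130, 29),
    "AABW, land ice (+70 mSv, Q = 29 Sv)": salinity_change(70, 29),
    "AABW, atmosphere (+80 mSv, Q = 29 Sv)": salinity_change(80, 29),
}
for label, delta_s in cases.items():
    print(f"{label}: {delta_s:+.3f} g/kg")

# Net coastal change: the three AABW contributions nearly cancel.
net_coastal = sum(v for k, v in cases.items() if k.startswith("AABW"))
print(f"Net coastal change: {net_coastal:+.3f} g/kg")
```

Because equation (10) is linear in F, the three coastal contributions can simply be added, which is why the sea-ice salinification and the land-ice and atmospheric freshening can be compared term by term.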

To estimate the temporal salinity changes at the surface and in the newly formed
AAIW and SAMW, we assumed a constant value of Q and that the freshwater flux
and ocean salinity consist of a climatological value plus a time-dependent
perturbation ($\bar{F} + F'$ and $\bar{s} + s'$, respectively). Equation (8) then yields:

$$\rho V \frac{ds'}{dt} = \rho Q s_{sw} - \rho_{fw} s_{sw} \bar{F} - \rho Q \bar{s} - \rho_{fw} s_{sw} F' - \rho Q s' \qquad (11)$$
As the climatological fluxes are in steady state, the first three terms on the right side 
in equation (11) cancel according to equation (9), resulting in: 
$$\rho V \frac{ds'}{dt} = -\rho_{fw} s_{sw} F' - \rho Q s' \qquad (12)$$

We approximated the freshwater flux perturbation (F’ = at) using our estimated 
trend a, and rearranged the terms resulting in a first-order linear differential 
equation: 


$$\frac{ds'}{dt} = -\frac{\rho_{fw} s_{sw} a}{\rho V} t - \frac{Q}{V} s' \qquad (13)$$
Integration in time yields an expression for the time-dependent evolution of the
salinity perturbation:

$$s'(t) = -\frac{\rho_{fw} s_{sw} a}{\rho Q} \left[ t - \frac{V}{Q} + \frac{V}{Q} e^{-Qt/V} \right] \qquad (14)$$


To obtain an estimate of the salinity trend at a given time t, we substituted equation (14)
into equation (13) as follows:

$$\frac{ds'}{dt} = \frac{\rho_{fw} s_{sw} a}{\rho Q} \left[ e^{-Qt/V} - 1 \right] \qquad (15)$$


The equilibrium response of the system, that is, the long-term trend after several 
years of perturbation, is: 


$$\lim_{t \to \infty} \frac{ds'}{dt} = -\frac{\rho_{fw} s_{sw} a}{\rho Q} \qquad (16)$$


Using our estimated sea-ice freshwater transport trend (a) of +9 ± 5 mSv per
decade and an AAIW/SAMW water-mass formation rate as above, we obtained
an equilibrium freshening rate of −0.023 ± 0.014 g kg⁻¹ per decade (green in
Extended Data Fig. 7b), which is valid for sufficiently large values of Qt/V.


Extended Data Fig. 7b (in purple and blue; using equation (14)) shows that if we
assumed that the trend started in 1982, there would be a delayed response lowering
the mean salinity trend estimate depending on V. We thus tested the sensitivity of
the trend to V, which corresponds to the upper 150 m between the zero sea-ice–ocean
freshwater flux line and the Subantarctic Front (Extended Data Fig. 7a),
which is the source region of the AAIW. The circumpolar V of about 5 × 10⁶ km³
results in a mean salinity trend (using equation (14)) of −0.014 ± 0.008 g kg⁻¹
per decade between 1982 and 2008 (purple). However, the AAIW formation
does not occur in a circumpolar belt but mostly in the south-eastern Pacific and
south-western Atlantic, that is, on either side of Drake Passage. Assuming that
most of the water is modified in this region and further downstream in the South
Pacific, we estimated a second, somewhat smaller V of about 2 × 10⁶ km³
(shown in blue). The sea-ice freshwater transport trend into this reference volume
is about +8 ± 5 mSv per decade (Figs 2c, d), resulting in a mean salinity trend
(using equation (14)) of −0.018 ± 0.010 g kg⁻¹ per decade (blue); because a certain
amount of freshwater is transported eastwards out of this sector (blue), the mean
trend of the delayed response lies somewhere in between the estimates based on
the two different reference volumes (blue and purple).

It is unlikely that the trend started exactly in 1982. Thus, the actual salinity 
response will fall between our estimated delayed response and the equilibrium 
response. For the range of values above, the deviations in the freshening rate due 
to effects of a delay and variations in the reference volume are much smaller than 
the actual magnitude of the trend itself. We thus conclude that the overall mean 
freshening rate of the newly formed AAIW and the surface waters advected north- 
wards across the Subantarctic Front into the SAMW due to the changes in sea-ice 
freshwater transport is about −0.02 ± 0.01 g kg⁻¹ per decade (Fig. 1b).
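
The equilibrium and delayed responses can be illustrated with a short Python sketch of equations (14) and (16). Several choices here are assumptions rather than the authors' method: the freshwater density and source salinity are placeholder values (the text takes them from equation (2)), the perturbation is taken to be zero at the start of the trend, the reference volumes are read as 5 × 10⁶ km³ and 2 × 10⁶ km³, and the "mean trend" is computed as the total change divided by the elapsed time.

```python
import math

RHO = 1027.0     # reference seawater density (kg m^-3), as stated in the text
RHO_FW = 1000.0  # ASSUMED freshwater density (kg m^-3)
S_SW = 34.7      # ASSUMED source-water salinity (g kg^-1)
SECONDS_PER_YEAR = 365.25 * 86400.0


def equilibrium_trend(a_msv_per_decade, q_sv):
    """Equation (16): long-term salinity trend in g kg^-1 per decade."""
    return -RHO_FW * S_SW * (a_msv_per_decade * 1e-3) / (RHO * q_sv)


def mean_delayed_trend(a_msv_per_decade, q_sv, volume_km3, years):
    """Mean trend s'(T)/T from equation (14), assuming s'(0) = 0."""
    # Flushing time scale V/Q, converted from seconds to years.
    tau_years = (volume_km3 * 1e9) / (q_sv * 1e6) / SECONDS_PER_YEAR
    # s'(T)/T = equilibrium trend * [1 - (tau/T) * (1 - exp(-T/tau))]
    damping = 1.0 - (tau_years / years) * (1.0 - math.exp(-years / tau_years))
    return equilibrium_trend(a_msv_per_decade, q_sv) * damping


eq = equilibrium_trend(9, 13)
circumpolar = mean_delayed_trend(9, 13, 5e6, 2008 - 1982)
sector = mean_delayed_trend(8, 13, 2e6, 2008 - 1982)
print(f"equilibrium: {eq:.4f} g/kg per decade")
print(f"delayed, circumpolar volume: {circumpolar:.4f} g/kg per decade")
print(f"delayed, AAIW-formation sector: {sector:.4f} g/kg per decade")
```

The smaller the reference volume, the shorter the flushing time V/Q and the closer the delayed response sits to the equilibrium rate, which is the sensitivity the paragraph above describes.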

Data deposition. Sea-ice freshwater fluxes leading to the main conclusions are 
publicly available (http://dx.doi.org/10.16904/8). Other presented data are available 
from the corresponding author upon request. 

Code availability. Climate Data Operators (CDO; version 1.6.8) used for part 
of the analysis is publicly available (http://www.mpimet.mpg.de/cdo). Other 
analytical scripts are available upon request from the corresponding author. 
Extended Data Figure 1 | Uncertainties and trends in Antarctic sea-ice
concentration over the period 1982-2008. a, BA minus CDR merged 
data. b, NTA minus CDR merged data. c, Decadal trends of the BA 
sea-ice concentration. Stippled trends are statistically significant (at a 90% 
confidence level or higher using Student's t-test). d, Decadal trends of the 
BA minus NTA data. The thick grey line marks the mean sea-ice edge 
(1% sea-ice concentration). See Methods for details. 
Extended Data Figure 2 | Mean, trend and uncertainty of the Antarctic
sea-ice thickness. a, Decadal trends of the corrected reconstruction
(1982-2008). Stippled trends are statistically significant (at a 90%
confidence level or higher using Student's t-test). b, Mean of the
reconstruction (1982-2008). c, Mean of the corrected reconstruction
(1982-2008). d, Mean of the non-gridded ICESat-1 data (2003-2008,
13 campaigns). e, Reconstruction minus non-gridded ICESat-1
data (2003-2008). f, Corrected reconstruction minus non-gridded
ICESat-1 data (2003-2008). g, Mean of the ASPeCt data (1980-2005).
h, Reconstruction minus ASPeCt data (1980-2005). i, Corrected
reconstructions minus ASPeCt data (1980-2005). The thick grey line
marks the mean sea-ice edge (1% sea-ice concentration). Differences are
based on data points when both respective products were available. Data
points without data in the sea-ice-covered region are shaded in grey in d-i.
See Methods for details.
Extended Data Figure 3 | Sea-ice drift speed comparison between the
NSIDC and Kwok et al. data for the period 1992-2003. a, b, Low-pass
filtered, 21-d running mean for the original (a) and bias-corrected (b)
daily meridional NSIDC sea-ice drift speed compared with the low-pass
filtered daily meridional Kwok et al. data. Contours mark the number of
grid boxes and the blue line marks the fitted least squares linear regression
line. c-e, Mean sea-ice drift speed of the original (c) and bias-corrected
NSIDC (d) and Kwok et al. (e) sea-ice drift speed. The arrows denote the
drift vectors. f, R.m.s. differences between the annual mean bias-corrected
NSIDC and Kwok et al. sea-ice drift speed. The thick grey line in c-f
marks the mean sea-ice edge (1% sea-ice concentration). Data points were
compared when both data sets were available. See Methods for details.
Extended Data Figure 4 | Temporal inhomogeneities in the NSIDC
satellite sea-ice drift data. a, Annual mean meridional sea-ice drift
speed averaged over the entire sea-ice area (sea-ice concentration
>50%). The thick orange lines show the spurious trends due to changes
in the underlying data. The black lines show the data corrected for
inconsistencies and used in this study (1982-2008). b, Low-pass filtered
(91-d running mean) sea-ice drift speed averaged over the entire sea-ice
area (sea-ice concentration >50%). The grey lines show the reduced wind
speed from ERA-Interim using a reduction factor from the period
1988-2008. The uncorrected data for each satellite instrument
combination are shown in colour (dashed lines show the mean over
the respective period). The black vertical lines show the periods of the
channels. The coloured text denotes the sensors and the frequency of the
microwave radiometer channels used. c, The fraction of sea-ice covered
grid boxes with at least one drift vector observation in a 21-d window
and a 75 km × 75 km grid box using the non-gridded NSIDC drift data.
The colours indicate the contribution of each sensor and channel.
d, Different combinations of instruments and passive microwave sensor
channels and the related periods underlying the NSIDC sea-ice drift data.
See Methods for details.
Extended Data Figure 5 | Time series and regions of annual northward
sea-ice freshwater transport. a-c, Transport from the coastal ocean
to the open ocean region in the Southern Ocean (a), Atlantic sector
(b) and Pacific sector (c). d, Transport across latitude bands in the
Atlantic (69.5° S) and Pacific (71° S) sectors. Orange indicates transport
estimates if temporal inhomogeneities were not accounted for. Blue shows
homogeneous years only. Green represents homogenized time series. Years
that have been corrected or removed are shaded in grey. Straight lines
show the linear regressions for the periods 1982-2008 (dashed orange
and green), 1982-1986 (solid orange) and 1988-2008 (homogeneous
years only; solid blue). See Methods for details. e, Regions used for the
evaluation of the sea-ice freshwater fluxes. Turquoise shading indicates
the area south of the coastal Ross Sea flux gate. Dark blue shading
highlights the area south of the coastal Weddell Sea flux gate. Purple
lines are the 69.5° S latitude band in the Atlantic sector and the 71° S
latitude band in the Pacific sector. The black line shows the smoothed
mean zero sea-ice–ocean freshwater flux line that divides the coastal and
open ocean regions (see Methods). The thick grey line shows the mean
sea-ice edge (1% sea-ice concentration) and the green lines mark basin
boundaries.
Extended Data Figure 6 | Trends of the net annual freshwater fluxes
associated with sea ice over the period 1982-2008 if temporal
inhomogeneities in the sea-ice drift data were not considered.
a, b, Linear trends in the meridional sea-ice freshwater transport (a) and
the net sea-ice–ocean freshwater flux from freezing and melting (b). The
arrows in a denote the trend of the annual transport vectors. Stippled
trends are significant at the 90% confidence level using Student's t-test
(Methods). Thick black lines show the zero sea-ice–ocean freshwater flux
line used to divide the coastal from the open ocean regions; the thin black
lines mark the continental shelf (1,000 m isobath), the grey lines show the
sea-ice edge (1% sea-ice concentration) and the green lines indicate the
basin boundaries.


© 2016 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 



Extended Data Figure 7 | Contribution of sea-ice freshwater flux trends
to ocean salinity. a, Map showing the regions used for the estimation of
salinity changes due to sea-ice freshwater fluxes. The blue lines show the
sector important for AAIW formation (167° E to 23° W). The purple line
is the Subantarctic Front. The black line indicates the smoothed mean
zero freshwater flux line that divides the coastal and open ocean regions.
The thick grey line is the mean sea-ice edge (1% sea-ice concentration).
b, The salinity response to a freshwater flux perturbation using the
long-term equilibrium response (green) and using a delayed response
starting in 1982 for a circumpolar reference volume (5 × 10⁶ km³;
purple) or for the region of most AAIW formation (2 × 10⁶ km³; blue).
See Methods for details. Dashed lines show the respective asymptotic
equilibrium response. The black lines are the respective current trends.
The grey shading shows the approximate observed long-term trend in the
AAIW. c, Observed long-term sea-surface salinity trends (data from
ref. 85).
Extended Data Table 1 | Mean and uncertainties of the annual
sea-ice freshwater fluxes over the period 1982-2008

                           Flux [mSv]
Southern Ocean:
  Transport                +130 ± 30
  Net open ocean           +130 ± 30
  Net coastal ocean        −130 ± 30
  Net continental shelf     −60 ± 20
  Total melting            +460 ± 100
  Total freezing           −410 ± 110
Atlantic sector:
  Transport                 +60 ± 20
  Net open ocean            +60 ± 20
  Net coastal ocean         −50 ± 20
  Net continental shelf     −20 ± 5
  Total melting            +180 ± 40
  Total freezing           −160 ± 40
Indian Ocean sector:
  Transport                 +10 ± 5
  Net open ocean            +10 ± 5
  Net coastal ocean         −10 ± 6
  Net continental shelf     −10 ± 4
  Total melting             +70 ± 30
  Total freezing            −70 ± 30
Pacific sector:
  Transport                 +60 ± 20
  Net open ocean            +60 ± 20
  Net coastal ocean         −60 ± 20
  Net continental shelf     −30 ± 9
  Total melting            +200 ± 50
  Total freezing           −180 ± 60

Uncertainty components [mSv]: Δt, ΔA, Δc, Δh, Δu, ΔC_fw


+0 +5 +0 +16 
+0 +5 +0 +16 
+0 +5 +0 +14 
+0 +0 +0 +8 
+37 - +1 +74 
+37 - +1 +73 


41700-40443 
H7 = 40443 


+25 


+12 
+12 
+13 

+6 
+23 
+24 


Positive numbers indicate a freshwater flux into the ocean or northward transport
(1 mSv = 10³ m³ s⁻¹). The final uncertainty estimate (95% confidence level) stems from the
uncertainties in the filtering of high-frequency temporal noise (Δt), variations of the zero
freshwater flux line (ΔA), sea-ice concentration (Δc), sea-ice thickness (Δh), sea-ice drift (Δu) and
the freshwater conversion factor (ΔC_fw), respectively. See Methods for details. See Fig. 2 for the
definition of regions.
Extended Data Table 2 | Decadal trends of the annual sea-ice
freshwater fluxes and their uncertainties over the period 1982-2008

All values in mSv dec⁻¹.

                          Flux       Δs     Δt     ΔA     Δc     Δh     Δu     ΔC_fw
Southern Ocean:
  Transport                +9 ± 5    ±3.2   ±0.3   ±1.1   ±0.8   ±3.0   ±1.9   ±0.5
  Net open ocean          +10 ± 5    ±3.5   ±0.4   ±1.1   ±0.8   ±3.0   ±2.0   ±0.5
  Net coastal ocean       −10 ± 5    ±3.5   ±0.2   ±1.1   ±0.7   ±3.3   ±1.1   ±0.5
  Net continental shelf    −3 ± 2    ±1.8   ±0.0   ±0.0   ±0.1   ±0.8   ±0.1   ±0.1
Atlantic sector:
  Transport                −4 ± 5    ±4.3   ±0.1   ±0.7   ±0.1   ±1.4   ±0.7   ±0.2
  Net open ocean           −4 ± 5    ±4.4   ±0.1   ±0.7   ±0.1   ±1.4   ±0.7   ±0.2
  Net coastal ocean        +6 ± 6    ±5.7   ±0.1   ±0.7   ±0.0   ±0.6   ±1.8   ±0.3
  Net continental shelf    +6 ± 3    ±2.5   ±0.0   ±0.0   ±0.0   ±0.6   ±1.6   ±0.3
Indian Ocean sector:
  Transport                −1 ± 1    ±1.3   ±0.0   ±0.2   ±0.1   ±0.3   ±0.2   ±0.0
  Net open ocean           −1 ± 1    ±1.3   ±0.0   ±0.2   ±0.1   ±0.3   ±0.2   ±0.0
  Net coastal ocean        −3 ± 2    ±0.9   ±0.0   ±0.2   ±0.1   ±1.1   ±0.7   ±0.1
  Net continental shelf    +2 ± 1    ±0.9   ±0.1   ±0.0   ±0.1   ±0.3   ±0.4   ±0.1
Pacific sector:
  Transport               +14 ± 5    ±3.4   ±0.2   ±0.6   ±0.7   ±1.3   ±2.8   ±0.7
  Net open ocean          +14 ± 5    ±3.4   ±0.3   ±0.5   ±0.7   ±1.2   ±2.9   ±0.7
  Net coastal ocean       −13 ± 5    ±3.6   ±0.2   ±0.5   ±0.6   ±1.9   ±2.3   ±0.7
  Net continental shelf   −10 ± 3    ±2.6   ±0.1   ±0.0   ±0.2   ±1.2   ±1.8   ±0.5

Positive numbers indicate a freshwater flux trend into the ocean or a northward transport trend
(1 mSv per decade = 10³ m³ s⁻¹ per decade). The final uncertainty estimate (95% confidence level)
stems from the standard error of the slope of the regression line (Δs), filtering of high-frequency
temporal noise (Δt), variations of the zero freshwater flux line (ΔA), sea-ice concentration (Δc),
sea-ice thickness (Δh), sea-ice drift (Δu) and the freshwater conversion factor (ΔC_fw), respectively.
Bold numbers indicate a significance of at least 90% confidence using Student's t-test. See
Methods for details. See Fig. 2 for the definition of the regions.
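
As a consistency check on the table, the individual uncertainty components can be combined and compared with the final uncertainty in the Flux column. The sketch below assumes that independent components add in quadrature (root sum of squares); the caption does not state the combination rule, so this is an assumption, and the function name is hypothetical. The component values are those of the Southern Ocean transport row, read as ±3.2, ±0.3, ±1.1, ±0.8, ±3.0, ±1.9 and ±0.5 mSv per decade.

```python
import math


def combined_uncertainty(components):
    """Root-sum-square of independent uncertainty components (ASSUMED combination rule)."""
    return math.sqrt(sum(c * c for c in components))


# Southern Ocean transport row: regression slope, temporal filtering, flux line,
# concentration, thickness, drift, freshwater conversion (all in mSv per decade).
transport_components = [3.2, 0.3, 1.1, 0.8, 3.0, 1.9, 0.5]
total = combined_uncertainty(transport_components)
print(f"combined uncertainty: {total:.2f} mSv per decade")
```

Under this assumption the combined value rounds to the ±5 mSv per decade listed for the transport trend, with the regression slope and the sea-ice thickness contributing most of the total.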
Extended Data Table 3 | Sensitivity of the northward sea-ice
freshwater transport trend to time periods and homogenization

                                         Southern    Atlantic    Indian Ocean   Pacific
                                         Ocean       sector      sector         sector
1992-2004:
  Flux trend [mSv dec⁻¹]                 +4 ± 9      −12 ± 11    −5 ± 3         +21 ± 10
1992-2008:
  Flux trend [mSv dec⁻¹]                 +11 ± 8     −5 ± 9      −2 ± 2         +17 ± 8
1982-2004:
  Flux trend [mSv dec⁻¹]                 +8 ± 5      −6 ± 5      −1 ± 1         +15 ± 6
1982-2008:
  Flux trend [mSv dec⁻¹]                 +9 ± 5      −4 ± 5      −1 ± 1         +14 ± 5
1982-2008 Monte Carlo analysis:
  Flux offset before 1987 [mSv]          +19 ± 5     +13 ± 7     +3 ± 2         +4 ± 5
  Probability for trend
  of same sign [%]                       100         92          78             100
  Probability for significant trend
  of same sign [%]                       92          26          9              100
  Posterior trend uncertainty
  [mSv dec⁻¹]                            ±5          ±6          ±2             ±5

Positive numbers indicate a northward freshwater transport trend (1 mSv per decade = 10³ m³ s⁻¹
per decade). Bold numbers indicate a significance of the trend of at least 90% confidence using
Student's t-test. The Monte Carlo analysis is performed for 10,000 normally distributed sample
offsets. Uncertainties (at the 95% confidence level) stem from the standard error of the slope of
the regression line and the data uncertainty. See Methods for details. See Fig. 2 for the definition
of the regions.
grassland diversity
Philip A. Fay, Yann Hautier, Helmut Hillebrand, Andrew S. MacDougall, Eric W. Seabloom, Ryan Williams,
Jonathan D. Bakker, Marc W. Cadotte, Enrique J. Chaneton, Chengjin Chu, Elsa E. Cleland, Carla D'Antonio,
Kendi F. Davies, Daniel S. Gruner, Nicole Hagenah, Kevin Kirkman, Johannes M. H. Knops, Kimberly J. La Pierre,
Rebecca L. McCulley, Joslin L. Moore, John W. Morgan, Suzanne M. Prober, Anita C. Risch, Martin Schuetz,
Carly J. Stevens & Peter D. Wragg


Niche dimensionality provides a general theoretical explanation 
for biodiversity—more niches, defined by more limiting factors, 
allow for more ways that species can coexist¹. Because plant species
compete for the same set of limiting resources, theory predicts 
that addition of a limiting resource eliminates potential trade-offs, 
reducing the number of species that can coexist”. Multiple nutrient 
limitation of plant production is common and therefore fertilization 
may reduce diversity by reducing the number or dimensionality of 
belowground limiting factors. At the same time, nutrient addition, 
by increasing biomass, should ultimately shift competition from 
belowground nutrients towards a one-dimensional competitive 
trade-off for light?. Here we show that plant species diversity 
decreased when a greater number of limiting nutrients were added 
across 45 grassland sites from a multi-continent experimental 
network⁴. The number of added nutrients predicted diversity
loss, even after controlling for effects of plant biomass, and even 
where biomass production was not nutrient-limited. We found that 
elevated resource supply reduced niche dimensionality and diversity 
and increased both productivity® and compositional turnover. Our 
results point to the importance of understanding dimensionality in 
ecological systems that are undergoing diversity loss in response to 
multiple global change factors. 

The search for the mechanisms underlying the coexistence of mul- 
tiple species was inspired by Darwin's observations of the problem of 
the ‘entangled bank’, or how different checks on the growth of individuals
underlie the number of species found together. One of the
most general theoretical explanations for this problem is that greater 
dimensionality, or number of non-overlapping ecological niches, 
allows for the coexistence of a greater number of species)”. However, 
plant coexistence challenges this understanding: rather than occupy- 
ing unique resource niches, plants share and are limited by the same 
essential resources. The coexistence of plants competing for the same
resources therefore requires stoichiometric and physiological trade-off 


differences for shared limiting resources*. Furthermore, plant resources 
are spatially separated, with elemental nutrients (for example, nitrogen, 
phosphorus, potassium) and water acquired belowground and light 
aboveground. This suggests that two, non-independent resource- 
based mechanisms could maintain plant diversity: multi-dimensional 
trade-offs for belowground limiting nutrients, juxtaposed with a one- 
dimensional trade-off for light aboveground. 

Resource competition theory predicts that addition of a limiting 
resource makes that resource non-limiting, thereby eliminating a 
competitive trade-off contributing to coexistence”. Because some factor 
must ultimately limit growth, resource additions will lead to a reduction 
in the number and a shift in the identity of growth-limiting factors. 
In the case of plants, addition of multiple nutrients should reduce the 
dimensionality of belowground resource trade-offs, increase biomass 
production, and ultimately shift the prevailing form of resource compe- 
tition towards a single, aboveground limiting resource, light*°. Support 
for this hypothesis has been demonstrated in four grassland experi- 
ments. All of these experiments found plant biomass production was 
limited by multiple resources, and diversity decreased as a function of 
the number of belowground resources made non-limiting®”"!!. These 
results are consistent with the hypothesis that multi-dimensional trade- 
offs for belowground resources, and light competition mediated by 
aboveground biomass production, might jointly contribute to main- 
taining plant diversity in natural communities. Although multiple lim- 
itation of primary producer communities is common”, a recent global 
study demonstrated substantial site-level variation in the number and 
identity of co-limiting resources, with around 25% of sites showing no 
evidence that biomass production was nutrient limited'’. The question 
remains whether the dimensionality of nutrient resources might con- 
tribute to plant diversity independently of the presumed importance 
of indirect effects of biomass on diversity. 

Here we tested for loss of species diversity in response to multiple 
nutrient additions? using the Nutrient Network, a globally-distributed, 
Figure 1 | Biodiversity and number of resources. a, Loss of species
diversity with greater number of added resources (effective number
of equally abundant species: ENS_PIE); this effect increased with years
of treatment 1-8 (Extended Data Table 1); year 0 shows pre-treatment
diversity. Bold lines show overall mean responses of 45 sites; y axis is
log-transformed. b, Greater number of added resources increased


nutrient addition experiment, replicated across grassland sites on 
six continents (NutNet; http://www.nutnet.org). We added factorial
combinations of phosphorus (P), nitrogen (N), and potassium
(K+μ; the K addition treatment included sulfur and a one-time addition
of micronutrients; see Methods), with the aim of removing potential
limitations from different combinations of the essential nutrient elements
that most strongly affect plant growth in natural and managed
systems worldwide. Our treatments varied in the number of elemental
resources they contained; hereafter, we use the term ‘number of added
resources’ (1, 2 or 3) to represent the minimum number of potentially
limiting elemental nutrients added (see Methods).

If competition for multiple belowground resources contributes to 
species coexistence, then diversity should decrease as a function of 
the number of resources added. Species diversity decreased as more 
resources were added, and this effect increased with duration of treat- 
ment (Fig. 1a and Extended Data Table 1). Greater number of added 
resources increased the annual rate of diversity loss, even after con- 
trolling for differences in experiment duration (Fig. 1b). We found a 
similar proportional loss of diversity with a greater number of added 
resources (using the log-ratio effect size of treatment divided by control 
diversity; Fig. 1b), meaning that in terms of the number of potential 
species lost, relative diversity losses and annual rate of diversity loss 
were similar. Sites differed in the size of their species pools, which 
ranged from 13 to 103 observed species over a three-year period, and 
we found that the magnitude of diversity loss rate per added resource 
increased with local species pool size (Fig. 1c).
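
The proportional loss described above rests on a log-ratio effect size. A minimal Python sketch follows, assuming the natural logarithm of treatment diversity divided by control diversity; the function name and the example values are hypothetical and are not taken from the study.

```python
import math


def log_ratio_effect(treatment_diversity, control_diversity):
    """Log-ratio effect size: ln(treatment / control). Negative values mean diversity loss."""
    if treatment_diversity <= 0 or control_diversity <= 0:
        raise ValueError("diversity values must be positive")
    return math.log(treatment_diversity / control_diversity)


# Hypothetical plots: the same absolute loss of 2 effective species is a
# larger proportional loss in a species-poor community.
rich_site = log_ratio_effect(18.0, 20.0)
poor_site = log_ratio_effect(3.0, 5.0)
print(f"species-rich site: {rich_site:+.3f}")
print(f"species-poor site: {poor_site:+.3f}")
```

Working on the log scale makes losses comparable across sites whose species pools differ widely, which matters here because pool sizes ranged from 13 to 103 species.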

We found that increasing the number of added resources increased 
live biomass (Fig. 2a), and decreased the proportion of photosyntheti- 
cally active radiation (PAR) transmitted through the canopy to the 
ground surface (Fig. 2b). Further, the amount of litter biomass, which 
can also contribute to light limitation and diversity loss'* increased 
with the number of added resources (Fig. 2c). Importantly, despite 
the complex causal effects of changes in multiple resources on the 
the mean rates of diversity loss per year (filled points; F1,134 = 24.8,
P < 0.0001), and the proportional loss of species relative to the controls,
shown as the effect size (open points; F1,134 = 46.2, P < 0.0001). c, Rate of
diversity loss per added resource (nres) was associated with greater total
site species number (log; R² = 0.25, P = 0.0004, n = 45). Error bars show
mean ± 95% confidence intervals.
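
The diversity metric in Figure 1 is an effective number of equally abundant species. One common formulation, assumed here because the Methods are not part of this section, is the inverse Simpson index, 1 / Σ pᵢ², where pᵢ is the relative abundance of species i. The function name and the example communities below are hypothetical.

```python
def effective_number_of_species(abundances):
    """Inverse Simpson index: 1 / sum(p_i^2) over relative abundances (ASSUMED definition)."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("at least one species must have positive abundance")
    proportions = [a / total for a in abundances if a > 0]
    return 1.0 / sum(p * p for p in proportions)


# Four species with equal cover behave like four effective species;
# four species with one dominant behave like barely more than one.
even_plot = effective_number_of_species([25, 25, 25, 25])
dominated_plot = effective_number_of_species([97, 1, 1, 1])
print(f"even community: {even_plot:.2f}")
print(f"dominated community: {dominated_plot:.2f}")
```

A metric of this kind responds to shifts in dominance as well as to outright species loss, so it can fall even when the species count is unchanged.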


relationship between diversity and biomass, the number of added 
resources remained a significant predictor of diversity loss, even after 
controlling for the potential contributing effects of species pool size, live 
biomass, total cover (a proxy for total plant abundance), light transmit- 
tance, and litter mass (Extended Data Tables 2 and 3). If species coexist 
through trade-offs in resource-ratio requirements, changes in belowground
resource supply could cause changes in competitive dominance
and lead to species exclusion, independent of aboveground effects
of biomass. In a subset of sites that did not show a biomass response 
to multiple nutrient addition, we nevertheless observed declines in 
diversity consistent with this theory (Fig. 3a, b: open points, n= 11), 
similar to sites where biomass production was multiple-resource 
limited (Fig. 3a, b: filled points, n = 34). Overall, 14 of the 45 sites in
this study showed some type of negative biomass response to N, P or
K+μ addition, suggesting the potential for elevated nutrient supply
to cause negative physiological responses in species not
adapted to high nutrient concentrations or to large stoichiometric
imbalances in resource supply.

Diversity loss increased only weakly with biomass increase in plots 
receiving all three resources, providing some support for indirect effects 
of biomass as a contributing, but not a sole, mechanism of diversity loss 
due to fertilisation (Fig. 3c). If species losses were most strongly asso- 
ciated with biomass increases, we would expect the greatest effects on 
both responses to be associated with the same nutrient addition treat- 
ment, but this was true for only 22 of 45 cases (Chi-square, P < 0.0001). 
The loss of diversity was not driven by the addition of any single added 
resource (for example, N); greatest diversity loss occurred with the 
addition of a combination of two or more resources in 31 of 45 cases. 
These findings further highlight that biomass production and diver- 
sity can be controlled differently by multiple resources. Overall, these 
results support our conclusion that resource niche dimensionality can 
contribute to species diversity independently of indirect effects medi- 
ated by biomass production. 
Figure 2 | Biomass and light. a, The rate of live biomass change per year
increased with an increasing number of added resources (F = 55.0,
P < 0.0001). b, The proportion of photosynthetically active radiation
(PAR) reaching the ground surface decreased with a greater number
of added resources, expressed as annual rate of change (F = 62.4,
P < 0.0001). c, The mean rate of litter (dead biomass) change per year
increased with the number of added resources (F = 4.37, P = 0.037).
Error bars show mean ± 95% confidence intervals.
Figure 3 | Multiple resource limitation. a, Increased number of added
resources resulted in positive and increasing biomass at sites showing 
multiple resource limitation (filled points); sites not limited by multiple 
resources tended to show negative biomass responses with resource 
addition (open points). b, Increased number of added resources drove 


For resource dimensionality to contribute to species coexistence, 
species must trade off their competitive abilities for different limiting
resources, and changes in resource supply ratios should drive species 
compositional turnover”. We found that a greater number of added 
resources increased the compositional divergence from control plots 
(Fig. 4a). Plots receiving a single resource treatment (N, P and K+μ
treatments) diverged as much from each other as they did on average
from the control plots (Fig. 4b), consistent with different species
trading off competitive abilities for different resources”. We found 
that greater diversity loss was weakly associated with greater com- 
munity dissimilarity when all three resources were added together 
(Fig. 4c), suggesting that resource addition caused changes in com- 
munity composition that were not always associated with diversity 
loss. Both composition and diversity of communities contribute to 
ecosystem functioning, and many of the proposed mechanisms of the 
effect of species diversity on ecosystem function are resource-based’”. 
Additionally, nutrient enrichment impacts some groups of species 
more than others (for example, a loss of native species in favour of 
exotic grasses). Because changes in resource supply led to communities
of fewer species and of different compositions, we expect changes
in resources, acting through diversity loss, to have both direct and 
indirect effects on ecosystem functions’. 
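
Compositional divergence between plots is measured in this study with the Bray–Curtis dissimilarity index (Fig. 4). A minimal Python sketch of the standard index follows, Σ|xᵢ − yᵢ| / Σ(xᵢ + yᵢ) over species i. The species names and cover values are hypothetical, and the authors' data preparation is not reproduced here.

```python
def bray_curtis(cover_a, cover_b):
    """Bray-Curtis dissimilarity between two plots given as {species: abundance} dicts."""
    species = set(cover_a) | set(cover_b)
    difference = sum(abs(cover_a.get(sp, 0.0) - cover_b.get(sp, 0.0)) for sp in species)
    total = sum(cover_a.get(sp, 0.0) + cover_b.get(sp, 0.0) for sp in species)
    if total == 0:
        raise ValueError("both plots are empty")
    return difference / total


# Hypothetical percentage cover in a control plot and a fertilized plot.
control = {"grass": 60.0, "forb": 40.0}
fertilized = {"grass": 90.0, "forb": 10.0}
dissimilarity = bray_curtis(control, fertilized)
print(f"Bray-Curtis dissimilarity: {dissimilarity:.2f}")
```

The index is 0 for identical plots and 1 for plots with no species in common, so a treatment can raise dissimilarity through shifts in dominance alone, without any species being lost.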

Although our results are consistent with predictions of the resource 
niche dimension hypothesis, they are also probably conservative. 
Our experimental design, a factorial manipulation of three resource 
treatments, represents a lower-bound estimate of the dimensionality 
of nutrient resources because our K+μ treatment included sulfur
and up to 10 other macro- and micro-nutrients, of which more than 
one may have been limiting'*. Multiple chemical forms of a limiting 
nutrient can also contribute to species diversity”, further expanding 
potential resource dimensionality. Stronger tests of the role of multiple 
resource competition for structuring species coexistence require 
Figure 4 | Community composition. a, Community composition diverged
from control plots with greater number of added resources (Bray–Curtis
dissimilarity index). Resource addition caused greater dissimilarity of
community composition relative to mean pre-treatment dissimilarity,
indicated by grey stars. b, Single nutrient additions of N, P or
K+μ resulted in communities that diverged as much from each other as
similar diversity loss at sites where biomass production was limited by
multiple resources (filled points) and at sites where it was not (open
points). c, Negative relationship between the effect of addition of
three resources on biomass and diversity (one-tailed test for negative
relationship, R² = 0.11, P = 0.012, n = 45). Error bars show mean ± s.e.


physiological studies quantifying species-specific functional traits 
and trade-offs*', and testing whether species respond to resource 
treatments similarly in different environments. Deeper mechanistic 
insight can also be gained by asking how resource-dependent diversity 
patterns and mechanisms change across scales (for example, from 
local to regional) in response to global change drivers such as nutri- 
ent pollution”. Our results point to, but do not distinguish among, 
the presumed resource competition mechanisms’ that underlie the 
resource dimension hypothesis. 

We found that greater diversity loss was associated with soil P, K, 
pH and percentage sand, but not with soil N, or with latitude, or mean 
annual precipitation (Extended Data Table 4), suggesting that variation 
in soil properties may influence the degree to which communities 
respond to changes in resource availability”*. We did not test or control 
for other potential limiting factors such as herbivory or water, which 
can interact with nutrients in complex ways, and themselves contribute 
to species coexistence. For example, changes in nutrient availability 
affect photosynthetic tissue quantity and quality, and may alter the 
pattern and intensity of herbivory”, and the level of soil water deple- 
tion through transpiration losses. Our multi-year experimental results 
may still under-estimate nutrient effects when considering that global 
eutrophication represents a chronic and cumulative environmental 
change over many decades. Estimating effective upper bounds on eco- 
logically relevant resource dimensionality will depend on the degree 
to which multiple limiting factors covary, how they change in time 
and space, and how multiple limiting factors interact with each other 
in promoting coexistence. Global change is driving environmental 
conditions beyond multiple planetary boundaries”, and changing 
the limiting factors that structure species diversity*®. Understanding 
the mechanisms that underlie diversity loss caused by multiple global 
change factors is necessary to develop effective management strategies 
for restoring and preserving Earth's biodiversity. 
they did on average from the control plots. Pre-treatment values indicated
by grey stars. c, Negative relationship between the effect of addition
of three resources on community dissimilarity relative to controls and
diversity (one-tailed test for negative relationship, R² = 0.10, P = 0.019,
n = 45). Error bars indicate mean ± 95% confidence intervals.
METHODS


Data reporting. No statistical methods were used to predetermine sample size. 
The investigators were not blinded to allocation during experiments and outcome 
assessment. 

Experimental design. The Nutrient Network (NutNet) is a collaborative,
distributed experimental network. Sites are located across herbaceous terrestrial systems
on six continents. Vegetation types represented include grasslands, savannas and 
meadows and occur across a wide range of climate and environmental factors 
(Supplementary Table 1). At the 45 sites (on five continents) with appropriate 
experimental data for these analyses, one year of pre-treatment (year 0) data 
were collected followed by at least 3 years and up to 8 years of treatment data. 
Individual site experiments share identical design and sampling protocols, with 
minor site-specific differences in terms of replication and treatment duration 
(Supplementary Table 1). We applied factorial combinations of nitrogen (N), 
phosphorus (P), and potassium plus micronutrients, designated here as the K+μ
treatment, giving eight treatment combinations including the control with no
added resources. N was applied annually at 10 g N m⁻² yr⁻¹ as time-release urea.
Ammonium nitrate was used in 2007 at some sites before switching to urea due 
to restricted availability of ammonium nitrate; we found no differences in the 
short-term effects of alternative N sources in a separate experiment at four sites!®. 
P was applied at 10 g P m⁻² yr⁻¹ as triple-super phosphate, which also included
Ca at 8.1 g Ca m⁻² yr⁻¹. The K+μ treatment added a mix of K and S (10 g K m⁻² yr⁻¹
and 3.9 g S m⁻² yr⁻¹ as potassium sulphate) and micronutrients (100 g m⁻² yr⁻¹ of
a mixture composed of 6% Ca, 3% Mg, 12% S, 0.1% B, 1% Cu, 17% Fe, 2.5% Mn,
0.05% Mo, and 1% Zn). Micronutrients were only applied during the first treatment
year to minimise potential for toxic metal accumulation. Plots were 5 m × 5 m and
randomized within 1 to 6 blocks (Supplementary Table 1), with all eight treatment 
combinations occurring once per block. Sampling occurred at approximately peak 
biomass times for each site. 
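
As an illustration of the factorial design described above, the following minimal Python sketch enumerates the eight treatment combinations of the three resource treatments. The treatment labels and the function name are hypothetical choices made for this example and do not come from the NutNet protocol.

```python
from itertools import product

# Labels for the three resource treatments (hypothetical spellings for this sketch).
NUTRIENTS = ("N", "P", "Kmu")


def factorial_treatments(nutrients=NUTRIENTS):
    """Return every presence/absence combination of the given nutrients."""
    combos = []
    for flags in product((False, True), repeat=len(nutrients)):
        # Keep only the nutrients switched on in this combination.
        added = tuple(n for n, on in zip(nutrients, flags) if on)
        combos.append(added)
    return combos


treatments = factorial_treatments()
for added in treatments:
    label = " + ".join(added) if added else "control"
    print(f"{label:14s} number of added resources: {len(added)}")
```

The number of added resources per plot (0 to 3) is the predictor used in the analyses; with one replicate of each combination per block, a block holds eight plots.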

Response measurements. Biodiversity estimates are scale-dependent, and
increased resource availability can alter diversity-scaling relationships by changing
the size of species pools and thus introduce differences in the coverage of sampling
between treatments, due to larger and fewer individuals per area sampled, and
contribute to the loss of rarer species. We calculated species diversity as the effective
species number, which estimates the probability of interspecific encounter if all
species are equally abundant (ESN_PIE). ESN_PIE has been shown to be less sensitive
to scaling issues than other metrics, and is representative of the maximum slope
of the species-area accumulation function. We used ESN_PIE because NutNet sites
vary in their species pools and therefore their species accumulation curves will
differ, which makes it challenging to compare species diversity when sampled at a
fixed area. ESN_PIE has been shown to be relatively insensitive to such sampling
area issues because it essentially measures the maximum change in species number
as a function of sampling area (that is, the slope at the x intercept of the species
accumulation curve). Because the resource dimension hypothesis and underlying
resource ratio theory assume that species trade off for different limiting factors,
predictions for diversity change describe changes in competitive dominance;
ESN_PIE captures these predicted changes in dominance better than simple
measurements of local species extinction (that is, richness loss). We used the aggregate
number of species observed at a site as an estimate of the asymptote of the species
accumulation function, and of the regional species pool. We also used simply the
number of species (that is, richness) and found similar results to those using ESN_PIE
(Extended Data Table 1).

We measured species diversity annually by estimating the % cover of each plant
species within a 1 m × 1 m fixed location in each plot; the total cover typically
summed to greater than 100% due to multiple canopy layers. We quantified species
diversity as the probability of interspecific encounter (PIE), or effective species
number (ESN_PIE), assuming species relative abundances are equal:


$$\mathrm{ESN}_{\mathrm{PIE}} = \frac{1}{\sum_{i=1}^{s} p_i^{2}} \qquad (1)$$

where p_i is the proportion of species i in a community of size s; ESN_PIE is derived
from the inverse of Simpson's diversity index.
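
The calculation in equation (1) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name is hypothetical, and converting percentage cover to proportions by dividing by the plot's summed cover is an assumption made here because total cover can exceed 100%.

```python
def esn_pie(cover):
    """Effective species number: 1 / sum(p_i^2), with p_i the relative cover of species i.

    `cover` is a sequence of per-species cover values for one plot.
    Assumption of this sketch: proportions are obtained by dividing each
    value by the summed cover of the plot.
    """
    total = sum(cover)
    if total <= 0:
        raise ValueError("total cover must be positive")
    proportions = [c / total for c in cover if c > 0]
    return 1.0 / sum(p * p for p in proportions)


# Four equally abundant species give an effective number of four species.
even_plot = esn_pie([25, 25, 25, 25])
# A strongly dominated plot with the same richness has a much lower value.
dominated_plot = esn_pie([85, 5, 5, 5])
print(even_plot, dominated_plot)
```

The second plot has the same richness as the first but a far lower effective species number, which is why the metric tracks shifts in dominance that richness alone would miss.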

We measured aboveground live biomass by clipping two 1 m × 10 cm strips of
vegetation in each plot, sorting the sampled tissue to live (current year's
production) and dead (previous years' production) fractions, drying at 60 °C for 48 h and
weighing. At most sites, photosynthetically active radiation (PAR) was measured
above the plant canopy and at the ground surface and the proportion of transmitted
light calculated.

We categorised plant communities at sites as multiple-resource limited if 
biomass responded positively to fertilisation with combinations of different 
nutrients. Specifically, we designated sites as ‘multiple-resource limited’ if biomass 
increased with the independent addition of different resources or if biomass 
responded synergistically to two or more added resources (that is, the response to 
one nutrient was dependent on the level of another and their combined effect was 
super-additive)''. Sites that showed no response or negative biomass response or 
responded positively to only one resource we categorised as not multiple-resource 
limited. Thirty-four of the 45 sites showed increased biomass in response to 
multiple added resources; eight did not respond positively to resource addition, and 
three responded positively to a single resource (that is, single resource limited"). 
Statistical analysis. All analyses used R version 3.2.2. We used linear mixed-effects
models (lme function, R package nlme) to test the interaction of number of added resources and
the number of treatment years on diversity (ESN_PIE) and richness. Site and block
were modelled as nested random effects. We included in the model an autocorrelation
structure, a first-order autoregressive model (AR(1)), where observations
are expected to be correlated from one year to the next, and found a substantial
improvement in model fit when we compared this model to a model with no
autocorrelation structure (lower AIC, ΔAIC = 608, and likelihood ratio test, L.Ratio = 610,
P < 0.0001). Treatment effects increased in magnitude with time (significantly
negative interaction between number of added resources and year; Supplementary 
Table 2). To allow standardized comparison of sites that differed in the year they 
were established and in duration of nutrient addition, we used two approaches to 
quantify the changes in species diversity. First, we calculated the annual rate of 
change of our response variables to standardise site responses. Second, for analyses 
that required an effect size, calculated as the log ratio of the treatment response 
divided by the control, we used the most recent year of treatment data, which 
ranged from 3 to 8 years of annual nutrient application duration (Supplementary 
Table 1). Log ratio effect size estimates would not have been possible using the 
rate of change estimates, which can take zero or negative values. Log ratio effect 
sizes tend to be normally distributed, centre zero effects (control levels) at zero 
log ratios, and scale responses to make proportional effects directly comparable 
between sites*”. 
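
The log ratio effect size described above can be illustrated with a short Python sketch. The function name and the example values are hypothetical and are not data from the study.

```python
import math


def log_response_ratio(treatment, control):
    """Log ratio effect size: ln(treatment / control).

    Both values must be positive, which is why this effect size cannot be
    computed from rate-of-change estimates that may be zero or negative.
    """
    if treatment <= 0 or control <= 0:
        raise ValueError("log ratio requires positive treatment and control values")
    return math.log(treatment / control)


# Hypothetical diversity values for two sites with different baselines.
site_a = log_response_ratio(treatment=4.0, control=8.0)   # diversity halved
site_b = log_response_ratio(treatment=1.5, control=3.0)   # diversity halved
no_effect = log_response_ratio(treatment=5.0, control=5.0)
print(site_a, site_b, no_effect)
```

Both hypothetical sites lose half of their diversity and receive the same effect size despite different baselines, and a plot equal to its control sits at zero.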

We used linear mixed-effects models (lme function, R package nlme) to test the effects of
number of treatment years, site richness, log live biomass, log dead biomass, PAR, total
species cover, and the number of added resources on diversity (ESN_PIE), with plot
nested in block nested in site as random effects. Models using dead biomass and
PAR used the subset of 32 sites for which we had data for these variables. We calcu- 
lated mean values at each site for the annual rate of diversity loss and diversity effect 
size, and tested for linear relationships between these variables and the number 
of added resources using regression with site as a block term. We used step-wise 
linear regression and AIC criteria to test for relationships of loss of diversity (from 
addition of three resources) with latitude, longitude, and environmental covariates 
of mean annual precipitation, and soil N, P, K, pH, percentage clay, and percentage 
sand. Plant community composition changes were quantified using Bray-Curtis 
multivariate distances (R package vegan). 
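
For readers unfamiliar with the Bray-Curtis index, the following pure-Python sketch shows the standard formula for a pair of plots. It is an illustration only and is not the vegan implementation; the function name and the cover values are hypothetical.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two plots.

    `a` and `b` are equal-length sequences of non-negative abundances
    (for example % cover), one entry per species.
    Returns 0 for identical plots and 1 for plots sharing no species.
    """
    if len(a) != len(b):
        raise ValueError("both plots must list the same species")
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    if denominator == 0:
        raise ValueError("at least one plot must contain vegetation")
    return numerator / denominator


control = [10, 20, 0]      # hypothetical cover of three species in a control plot
fertilised = [10, 0, 20]   # hypothetical cover in a nutrient-addition plot
print(bray_curtis(control, fertilised))
```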
Extended Data Table 1 | The effects of nutrient addition on diversity loss and richness loss increase with time

ESN_PIE       Num. DF   Den. DF   F       P
intercept     1         6555      370.3   <0.0001
year          1         6555      39.3    <0.0001
nres          1         1049      32.2    <0.0001
year × nres   1         6555      26.1    <0.0001

Richness      Num. DF   Den. DF   F       P
intercept     1         6555      146.4   <0.0001
year          1         6555      209.1   <0.0001
nres          1         1049      91.8    <0.0001
year × nres   1         6555      33.5    <0.0001

Linear mixed-effects model of the effects of number of treatment years (ARIMA type-1 autocorrelation) and the number of added resources on diversity (log ESN_PIE) and richness, with plot nested in
block, nested in year, nested in site, as random effects, using all 45 sites. There was a significant, negative interaction between the number of added resources (nres) and year of treatment (year).
Extended Data Table 2 | The number of added resources predicts diversity loss after controlling for other variables

                            Num. DF   Den. DF   F       P
intercept                   1         1029      329.9   <0.0001
years of treatment          1         42                0.0012
site richness               1         42                <0.0001
log live biomass            1         1029              <0.0001
total cover                 1         1029      4.8     0.029
number of added resources   1         1029      35.4    <0.0001

Linear mixed-effects model of the effects of number of treatment years, site richness, log live biomass, total species cover, and the number of added resources on diversity (ESN_PIE), with plot nested in
block nested in site as random effects, using all 45 sites and data from the maximum treatment year for each site. ΔAIC between model with number of added resources and model without was 33,
log-likelihood ratio 35.0, P < 0.0001.
Extended Data Table 3 | The number of added resources is an important predictor even after controlling for other variables, for sites that
had light and litter data

                            Num. DF   Den. DF   F       P
intercept                   1         643       285.3   <0.0001
years of treatment          1         29        14.1    0.0008
site richness               1         29        25.7    <0.0001
log live biomass            1         643       7.9     0.0052
total cover                 1         643       4.5     0.034
log dead biomass            1         643       0.34    0.56
PAR                         1         643       18.2    <0.0001
number of added resources   1         643       15.6    0.0001

Linear mixed-effects model of the effects of number of treatment years, site richness, log live biomass, log dead biomass, PAR, total species cover, and the number of added resources on diversity
(ESN_PIE), with plot nested in block nested in site as random effects, using data from the maximum treatment year for each site, and the subset of 32 sites for which there was dead biomass and PAR
data. ΔAIC between model with number of added resources and model without was 15, log-likelihood ratio 15.6, P < 0.0001.
Extended Data Table 4 | Diversity loss due to addition of nutrients associated with soil properties

            DF   SS      F
soil P      1    0.16    1.72
soil K      1    0.018   0.20
pH          1    0.46    5.03
% sand      1    0.73    8.05
residuals   25   2.28

Stepwise multiple regression (backward with AIC criteria for model comparisons) retained soil P, K, pH, and percentage sand as predictors of diversity loss from the addition of three resources, for
the 30 sites with soil analysis data (excluding one site for extreme value of P). The variables latitude, longitude, mean annual precipitation, and soil percentage N were not retained. Overall model is
significant (r² = 0.375, F4,25 = 3.75, P = 0.016).
Serotonin engages an anxiety and fear-promoting
circuit in the extended amygdala


Catherine A. Marcinkiewcz, Christopher M. Mazzone, Giuseppe D'Agostino, Lindsay R. Halladay, J. Andrew Hardaway,
Jeffrey F. DiBerto, Montserrat Navarro, Nathan Burnham, Claudia Cristiano, Cayce E. Dorrier, Gregory J. Tipton,
Charu Ramakrishnan, Tamas Kozicz, Karl Deisseroth, Todd E. Thiele, Zoe A. McElligott, Andrew Holmes,
Lora K. Heisler & Thomas L. Kash


Serotonin (also known as 5-hydroxytryptamine (5-HT)) is a
neurotransmitter that has an essential role in the regulation of
emotion. However, the precise circuits have not yet been defined
through which aversive states are orchestrated by 5-HT. Here we
show that 5-HT from the dorsal raphe nucleus (5-HT^DRN) enhances
fear and anxiety and activates a subpopulation of corticotropin-releasing
factor (CRF) neurons in the bed nucleus of the stria
terminalis (CRF^BNST) in mice. Specifically, 5-HT^DRN projections
to the BNST, via actions at 5-HT2C receptors (5-HT2CRs), engage
a CRF^BNST inhibitory microcircuit that silences anxiolytic BNST
outputs to the ventral tegmental area and lateral hypothalamus.
Furthermore, we demonstrate that this CRF^BNST inhibitory circuit
underlies aversive behaviour following acute exposure to selective
serotonin reuptake inhibitors (SSRIs). This early aversive effect is
mediated via the corticotrophin-releasing factor type 1 receptor
(CRF1R, also known as CRHR1), given that CRF1R antagonism is
sufficient to prevent acute SSRI-induced enhancements in aversive
learning. These results reveal an essential 5-HT^DRN→CRF^BNST
circuit governing fear and anxiety, and provide a potential
mechanistic explanation for the clinical observation of early adverse
events to SSRI treatment in some patients with anxiety disorders.

Given the multiple converging lines of evidence pinpointing 5-HT as
a critical neuromodulator of pathological fear learning, we first
interrogated the endogenous recruitment of the 5-HT^DRN→BNST circuit by an
aversive footshock stimulus in mice. Using Fluoro-Gold to retrogradely
label BNST-projecting 5-HT neurons in the dorsal raphe nucleus
(DRN), we found that c-fos, an immediate-early gene indicative of
in vivo neuronal activation, was significantly elevated in 5-HT^DRN→BNST
neurons after footshock (Fig. 1a-f). Using in vivo electrophysiology,
we then probed the neuronal dynamics of the BNST during fear
conditioning and recall, and found evidence for engagement during both
conditioning and recall (Extended Data Fig. 1).

To decipher the role of this 5-HT^DRN→BNST circuit in aversive
behaviour, Channelrhodopsin2 (ChR2)-eYFP was selectively expressed in
5-HT^DRN neurons through the delivery of a Cre-inducible viral vector
in mice expressing Cre recombinase under the control of a serotonin
transporter promoter (Sert^Cre) (Sert is also known as Slc6a4) for both
in vivo and ex vivo analysis. We observed eYFP+ (5-HT) cell bodies in
the DRN and eYFP+ fibres in both the dorsal and ventral aspects of the
BNST (Sert^Cre::ChR2^DRN→BNST), confirming a direct projection of 5-HT
neurons originating in the DRN to the BNST (Fig. 1g, h). Optical
stimulation of these fibres in BNST slices evoked 5-HT release, as measured
by fast-scan cyclic voltammetry (FSCV) (Fig. 1i, j). Furthermore, bath
application of the SSRI fluoxetine reliably decreased the rate of 5-HT
reuptake, confirming that photostimulation of SERT+ terminals in the
BNST originating from the DRN induces 5-HT release (Fig. 1k, l).
We examined whether this 5-HT^DRN→BNST circuit is functionally
relevant for fear and anxiety-like behaviour. To investigate this,
Sert^Cre::ChR2^DRN→BNST mice were implanted with bilateral optical
fibres and photostimulated in the BNST (473 nm, 20 Hz) using a standard
tone-shock fear conditioning paradigm. Optogenetic stimulation of
this pathway was paired with a tone that co-terminated with a scrambled
footshock. Cued fear was assessed 24 h after, and contextual fear
48 h after, the initial fear acquisition session (Fig. 1m, n). Although no
changes were observed during fear acquisition, both cued and contextual
fear recall were significantly heightened in photostimulated
Sert^Cre::ChR2^DRN→BNST mice (Fig. 1o-q). We next assessed anxiety-like
behaviour using well-characterized assays: the elevated plus maze (EPM)
and novelty-suppressed feeding (NSF) tests. Upon stimulation with light,
Sert^Cre::ChR2^DRN→BNST mice exhibited enhanced anxiety-like behaviour
in both the EPM and NSF tests (Fig. 1r, s and Extended Data Fig. 2a, b).
Importantly, photostimulation did not induce hypolocomotion in the
EPM or open field tests, nor did it alter home-cage feeding, thus
confirming that hypophagia in the NSF assay was due to anxiety and not a
reduction in appetitive drive (Extended Data Fig. 2c-e). One potential
explanation of these results is that terminal stimulation in the BNST
produces antidromic spikes in DRN cell bodies that release 5-HT in other
brain regions, which could also be driving these behaviours. Therefore
we probed the mechanism more deeply using converging approaches.
To determine a receptor target through which 5-HT is signalling in the
BNST, we then examined the impact of optogenetically evoked 5-HT^DRN
release on postsynaptic neuronal excitability and found a 3.05 ± 0.59 mV
depolarization that was blocked by a 5-HT2CR antagonist (Fig. 1t, u).
In contrast to previous reports demonstrating co-release of 5-HT and
glutamate from DRN projections to the nucleus accumbens, we did
not observe any time-locked light-evoked EPSCs in the BNST (data not
shown). These results indicate that 5-HT^DRN→BNST projections have a
predominantly excitatory effect that is dependent on 5-HT2CR
signalling. To examine the role of 5-HT2CR-containing neurons in anxiety-like
behaviour, we took advantage of a Htr2c^Cre mouse line (Extended Data
Fig. 3a, b). Using 'designer receptors exclusively activated by designer
drugs' (DREADDs) that are coupled to the Gαq signalling pathway
(hM3Dq-DREADD), we found that activation of Gq signalling in
5-HT2CR-expressing neurons in the BNST significantly delayed the onset
of feeding in the NSF assay without affecting home cage feeding
behaviour (Extended Data Fig. 3c-g), thus phenocopying the effect observed
Figure 1 | Optogenetic identification of a 5-HT^DRN→BNST projection that
elicits anxiety and fear-related behaviour. a, Experimental timeline for c-fos
labelling of 5-HT^DRN→BNST neurons following an aversive footshock stimulus.
b, Representative images of Fluoro-Gold (FG, blue), tryptophan hydroxylase
(TPH, violet), and c-fos (green) staining in the DRN for 13 mice. Scale bars,
100 µm. c-f, Histograms depicting the number of double- and triple-labelled
neurons in the DRN of naive and shocked mice. c, There were no significant
differences in the number of BNST-projecting 5-HT^DRN neurons between
groups. d-f, Footshock led to significant elevations in the number of c-fos+
'activated' 5-HT neurons (t11 = 2.975, P < 0.05, Student's unpaired two-tailed
t-test, n = 7 naive and n = 6 shocked mice), c-fos+, Fluoro-Gold-labelled
neurons (t11 = 2.836, P < 0.05, Student's unpaired two-tailed t-test, n = 7
naive and n = 6 shocked mice), and triple-labelled neurons (t11 = 2.374,
P < 0.05, Student's unpaired two-tailed t-test, n = 7 naive and n = 6 shocked
mice). g, Experimental configuration for light-evoked FSCV experiments
in Sert^Cre::ChR2^DRN→BNST mice. h, Coronal images showing ChR2-YFP
expression in the soma of the DRN and axons of the BNST. Scale bars, 500 µm.
i, Representative colour plot of 5-HT release to optical stimulation (blue bar,
20 Hz, 20 pulses) for 3 mice. j, Representative cyclic voltammogram at peak
5-HT (black dashed line) for 3 mice. k, Representative current versus time


with 5-HT^DRN→BNST fibre stimulation during NSF. Together, these results
provide converging evidence that activation of 5-HT^DRN→BNST inputs
elicits anxiety-like behaviour via 5-HT2CR signalling.
trace at baseline (black) and following 10 µM fluoxetine (red) for 3 mice.
l, Clearance half-life of 5-HT at baseline (white) and following 10 µM
fluoxetine (red). t2 = 8.43, P < 0.05, Student's paired two-tailed t-test, n = 3
slices from 3 mice. m, Sert^Cre mice were transduced in the DRN and implanted
with bilateral optical fibres in the BNST. n, Schematic of fear conditioning
procedures in Sert^Cre::ChR2^DRN→BNST mice. o-q, Photostimulation during
fear acquisition had no effect on freezing behaviour during fear learning
but increased freezing during cued (t17 = 2.436, P < 0.05, Student's unpaired
two-tailed t-test, n = 10 control, n = 9 ChR2) and contextual fear recall
(t17 = 2.271, P < 0.05, Student's unpaired two-tailed t-test, n = 10 control,
n = 9 ChR2). r, Light delivery to the BNST reduced open arm time in the
EPM (t15 = 2.79, P < 0.05, Student's unpaired two-tailed t-test, n = 8 control,
n = 9 ChR2). s, Increased latency to feed in the NSF (t17 = 2.19, P < 0.05,
Student's unpaired two-tailed t-test, n = 9 control, n = 10 ChR2). t, Action
potentials generated by photostimulation in the DRN (5 Hz (top), 10 Hz
(middle), 20 Hz (bottom), 473 nm). u, Depolarization in cells (t8 = 5.20,
P < 0.01, one-sample t-test, n = 9 cells from 4 mice) after photostimulation
in the BNST (5 Hz, 10 s, 473 nm) and blockade of this response by 5 µM
RS-102221 (5-HT2CR antagonist) (t4 = 2.5, P > 0.05, one-sample t-test, n = 5
cells from 2 mice). Data are mean ± s.e.m. *P < 0.05; ***P < 0.001.


We considered the neurochemical phenotype of these target
5-HT^DRN→5-HT2CR^BNST neurons and hypothesized that 5-HT via
5-HT2CR modulates the activity of neurons expressing the neuropeptide
CRF. This hypothesis was based on a previous analysis of 5-HT2CR
knockout mice, which exhibit an anxiolytic phenotype associated with
a reduction of c-fos in CRF^BNST neurons. Initially, using CRF reporter
mice to a priori select CRF neurons for recordings, we found a
heterogeneous 5-HT-induced response in CRF^BNST neurons (Extended Data
Fig. 4a), with only a subset demonstrating a depolarization. Consistent
with this, double fluorescence in situ hybridization revealed that only
a subset of CRF neurons within the dorsal BNST (~70%) and ventral
BNST (~43%) express 5-HT2CRs (Extended Data Fig. 4b-d).
Although CRF signalling within the BNST is associated with anxiety-like
behaviour, more recent studies using circuit-based tools have
found that optogenetic stimulation of GABAergic projections (which
include CRF^BNST neurons) to the ventral tegmental area (VTA) is
anxiolytic. This led us to hypothesize the existence of functionally
distinct subsets of CRF^BNST neurons that gate different behaviours
and are differentially sensitive to 5-HT. We used fluorescent retrograde
tracer beads to label CRF^BNST neurons as VTA-projecting or
non-VTA-projecting (Fig. 2a), and found that VTA-projecting CRF
neurons (CRF^BNST→VTA neurons) were hyperpolarized by an average
of 5.73 ± 1.24 mV and non-VTA-projecting CRF neurons were
depolarized by an average of 2.74 ± 0.39 mV during 5-HT bath application.
Moreover, the excitatory response to 5-HT in non-VTA-projecting CRF
neurons was reversed in the presence of a 5-HT2C receptor antagonist
(Fig. 2b). Furthermore, all CRF^BNST→VTA neurons were non-responsive
to the 5-HT2R agonist meta-chlorophenylpiperazine (mCPP), whereas
all non-VTA-projecting CRF neurons were depolarized by mCPP by an
average of 3.26 ± 0.74 mV (Extended Data Fig. 4e-h). These findings
suggest an anatomically distinct response to 5-HT by different
subsets of CRF^BNST neurons. The subset of CRF^BNST neurons expressing
Figure 2 | Serotonin activates a local population of CRF^BNST neurons
that inhibits outputs to the midbrain. a, Recording scheme for CRF
reporter mice injected with retrograde tracer beads in the VTA. b, 5-HT
depolarizes local CRF neurons (t5 = 7.06, P < 0.001, one-sample t-test,
n = 6 cells from 4 mice) in the BNST while hyperpolarizing CRF^BNST→VTA
neurons (t6 = 4.64, P < 0.01, one-sample t-test, n = 7 cells from 6 mice).
Non-VTA-projecting CRF neurons are hyperpolarized by 5-HT in the
presence of the 5-HT2CR antagonist RS-102221 (t4 = 4.74, P < 0.01,
one-sample t-test, n = 5 cells from 3 mice). c, Top and middle, schematic
depicting infusions and recording configuration for Crf^Cre::ChR2^BNST mice
injected with retrograde tracer beads in the VTA. Bottom, representative
trace of light-evoked IPSC in beaded (that is, VTA projecting), non-ChR2
expressing neurons in the BNST of Crf^Cre::ChR2 mice with retrograde
tracer beads in the VTA (n = 8 cells from 3 mice) and blockade of this
response by GABAzine (F1,33 = 53.16, P < 0.001, repeated measures
one-way ANOVA, n = 4 cells from 3 mice). d, Recording scheme for C57BL/6
mice with retrograde tracer beads in the VTA or LH. e, Representative
5-HT2CRs do not project to the VTA and are depolarized by 5-HT,
whereas the CRF^BNST→VTA neurons are hyperpolarized by 5-HT, via
actions at another 5-HT receptor.

To determine if this 5-HT-dependent mechanism extended to other
anxiolytic efferents, we injected retrograde tracer beads into the lateral
hypothalamus (LH) of CRF reporter mice and found 5-HT had similar
bidirectional effects on non-LH-projecting and LH-projecting CRF^BNST
neurons (Extended Data Fig. 5a-c). Noting the functional similarities
between these two populations, we used retrograde tracing to determine
that roughly 58% of CRF^BNST neurons have projections to the
LH or VTA (Extended Data Fig. 5d-f). Notably, ~20-31% of these
CRF^BNST output neurons form parallel projections to these structures.

In light of recent reports that CRF^BNST neurons are exclusively
GABAergic, we hypothesized that non-VTA-projecting CRF^BNST
neurons may locally inhibit BNST→VTA neurons to promote fear
and anxiety. To test this hypothesis, we injected Crf^Cre mice with a
Cre-inducible ChR2 into the BNST and retrograde tracer beads into the
VTA. We then recorded light-evoked inhibitory postsynaptic currents
(IPSCs) from non-ChR2 (ChR2-negative, retrograde tracer-positive)
VTA-projecting BNST neurons (Fig. 2c). Photostimulation produced
action potentials in CRF^BNST neurons and light-evoked IPSCs in
non-ChR2 VTA-projecting neurons, indicating that CRF^BNST neurons form
local GABAergic synapses with BNST neurons that project to the VTA.
Repeating these same experiments in Crf^Cre::ChR2^BNST mice with
retrograde tracer beads in the LH, we found that we could evoke GABA
currents using photostimulation in LH-projecting neurons as well (Extended
Data Fig. 5g-i). Moreover, we observed that 5-HT increased GABAergic
transmission on to BNST→VTA projecting neurons in a tetrodotoxin
and 5-HT2CR antagonist dependent manner (Fig. 2d-f and Extended
traces of sIPSCs in BNST neurons that project to the VTA before and after
5-HT application for 5 cells from 4 mice. f, Bar graphs showing magnitude
of 5-HT effect on average sIPSC frequency in BNST neurons that project
to the VTA (t4 = 3.257, P < 0.05, one-sample t-test, n = 5 cells from
4 mice) and in BNST neurons that project to the LH (t5 = 3.027, P < 0.05,
one-sample t-test, n = 6 cells from 3 mice) and blockade of these responses
by tetrodotoxin (TTX) and RS-102221. Effects on amplitude were
non-significant. g, Experimental scheme for experiments with
Crf^Cre::Intrsect-ChR2^BNST mice. h, i, 5-HT significantly depolarizes non-projecting CRF
(Intrsect) neurons in the BNST (t6 = 2.501, P < 0.05, one-sample t-test,
n = 7 cells from 5 mice) and produces a significant change in membrane
potential in CRF Intrsect neurons compared to all CRF neurons (t26 = 2.08,
P < 0.05, Student's unpaired two-tailed t-test, n = 21 cells from 14 mice
for experiments in all CRF neurons and n = 7 cells from 5 mice for
Crf^Cre::Intrsect-ChR2^BNST experiments). Data are mean ± s.e.m. *P < 0.05;
**P < 0.01; ***P < 0.001. § denotes P < 0.05 for the Student's unpaired
two-tailed t-test between all CRF neurons and CRF Intrsect neurons in h.

Data Fig. 5j-n). Similar effects of 5-HT on GABAergic transmission were
found in BNST→LH-projecting neurons (Extended Data Fig. 5o-v).
Furthermore, slice recordings in a CRF reporter line indicate that 5-HT
does not increase GABAergic transmission on to the general population
of CRF^BNST neurons, nor does it directly excite non-CRF VTA-projecting
neurons (Extended Data Fig. 6). The 5-HT2R agonist mCPP also
increased GABAergic but not glutamatergic transmission in the BNST
(Extended Data Fig. 7). Finally, to test whether optically evoked 5-HT can
inhibit BNST outputs to the VTA, we performed slice recordings in the BNST
of Sert^Cre::ChR2^DRN→BNST mice and found that brief photostimulation of
5-HT terminals in the BNST increased spontaneous IPSCs (sIPSCs) on
to VTA-projecting BNST neurons in a manner similar to bath-applied
5-HT (Extended Data Fig. 8a-c). Together, these experiments indicate
that CRF^BNST neurons inhibit at least two major BNST outputs to the
VTA and LH that are reported to be anxiolytic'*"4, providing mechanistic
insight into the aversive actions of 5-HT signalling in the BNST.

We took advantage of a new combinatorial strategy called INTronic
Recombinase Sites Enabling Combinatorial Targeting or INTRSECT’* that
allows for direct visualization of these non-projecting, putatively local
CRF^BNST neurons in the BNST. By coupling retrograde Cre-dependent
flippases (HSV-LSL1-mCherry-IRES-flpo) in the VTA and LH with a
(Cre-on/Flp-off)-ChR2-eYFP viral construct in the BNST of Crf^Cre mice
(Crf^Cre::Intrsect-ChR2^BNST mice), we were able to genetically isolate

Figure 3 | Acute fluoxetine elicits aversive behaviour by engaging
inhibitory CRF circuits in the BNST. a, Schematic of recording for in vivo
fluoxetine experiments in CRF reporter mice. b, Representative traces of
sIPSCs in VTA-projecting neurons in the BNST for 5 experiments in 2
saline-treated mice and 7 experiments in 2 fluoxetine-treated mice.
c, d, Bar graphs showing that fluoxetine increases sIPSC frequency
(t10 = 2.55, P < 0.05, Student’s unpaired two-tailed t-test, n = 5 cells from
2 saline-treated mice, n = 7 cells from 2 fluoxetine-treated mice), but
not amplitude (t10 = 0.4752, P > 0.05, Student’s unpaired two-tailed
t-test, n = 5 cells from 2 saline mice, n = 7 cells from 2 fluoxetine mice) in
VTA-projecting neurons in the BNST. e, Experimental configuration for
assessment of anxiety in fluoxetine-treated Crf^Cre::hM4Di^BNST (Gi-coupled
DREADD) mice and a coronal slice of the BNST expressing hM4Di-mCherry.
Scale bar, 100 μm. f, Confirmatory electrophysiology in the BNST showing
hyperpolarization of hM4Di-mCherry-expressing cells following bath
application of CNO (t5 = 4.32, P < 0.01, one-sample t-test, n = 6 cells from


100 | NATURE | VOL 537 | 1 SEPTEMBER 2016 


non-VTA/LH-projecting CRF neurons in the BNST. We also infused
a Cre-dependent HSV-mCherry vector in a subset of
Crf^Cre::Intrsect-ChR2^BNST mice as a control. In HSV-flpo-infused
Crf^Cre::Intrsect-ChR2^BNST mice, we observed a significant reduction in
YFP+ cells in the ventral BNST (Extended Data Fig. 8d-f), indicating that a
large proportion of VTA-projecting and LH-projecting CRF^BNST neurons are
located in the ventral BNST. We also found that 5-HT robustly depolarized
these Crf^Cre::Intrsect-ChR2^BNST neurons compared to CRF neurons
in general (Fig. 2g-i). Furthermore, we observed light-evoked IPSCs in
the BNST of Crf^Cre::Intrsect-ChR2^BNST mice, confirming local GABA
release from these neurons (Extended Data Fig. 8g). These results support
the existence of a separate population of local CRF^BNST neurons
that is excited by 5-HT and increases local GABAergic transmission in
the BNST, distinct from a population of CRF^BNST neurons that project
to and release GABA in the VTA or the LH (Extended Data Fig. 8h-j).

To probe the translational relevance of these BNST microcircuits, 
we adopted a pharmacological approach using SSRIs. SSRIs represent 
one of the most widely used classes of drugs for psychiatric disorders. 
One limitation of SSRIs is that acute administration can lead to neg- 
ative behavioural states! ?, a finding that is recapitulated in rodent 
models*!*~°. Importantly, the BNST has been demonstrated to be an 
anatomical site of action for some of the aversive actions of SSRIs in 
rodents*. This provided the opportunity to test our model that 5-HT 

4 mice). g, h, Chemogenetic silencing of CRF neurons attenuates
fluoxetine-induced anxiety-like behaviour on the elevated zero maze
(F1,30 = 7.086, P < 0.05, two-way ANOVA, n = 10 fluoxetine and hM4Di and
n = 8 for all other groups) without any concomitant locomotor effects.
i, Experimental configuration for fear conditioning experiments in
Crf^Cre::hM4Di^BNST mice. j, k, Chemogenetic silencing of CRF^BNST neurons
had no effect on freezing behaviour during fear learning but prevented
fluoxetine enhancement of cued fear recall (F1,17 = 8.73, P < 0.01, two-way
ANOVA, n = 6 mCherry and vehicle and n = 5 per group for all other groups).
l, Experimental configuration for assessment of the role of BNST outputs to
the VTA and LH in fluoxetine-induced aversive behaviour. m, Confocal image
of the BNST from HSV^Cre::hM3Dq^BNST mice. Scale bars, 500 μm.
n, o, Chemogenetic activation of BNST neurons that project to the midbrain
did not impact fear acquisition but attenuated fluoxetine-induced
enhancement of cued fear recall (F1,27 = 7.541, P < 0.05, two-way ANOVA,
n = 7 vehicle/hM3D and n = 8 for all other groups). Data are mean ± s.e.m.
*P < 0.05; **P < 0.01.

in the BNST drives aversive behaviour through inhibition of BNST
outputs to the VTA. We observed that an acute systemic injection of
the SSRI fluoxetine increased GABAergic transmission on to VTA-projecting
neurons in the BNST (Fig. 3a-d). We then interrogated the role
of CRF^BNST neurons in acute fluoxetine-enhanced anxiety using Crf^Cre
mice transduced in the BNST with a Cre-inducible DREADD coupled
to the Gαi signalling pathway (hM4Di-DREADD). We found that
acute fluoxetine potentiated anxiety-like behaviour, and this effect was
blocked by chemogenetic inhibition of CRF^BNST neurons (Fig. 3e-h).

To evaluate directly whether endogenous 5-HT acts on CRF^BNST
neurons to enhance cued fear memory, we used the same chemogenetic
approach to silence CRF^BNST neurons during fluoxetine treatment
and subsequent fear conditioning (Fig. 3i). Chemogenetic inhibition
of CRF^BNST neurons also significantly attenuated fluoxetine-induced
enhancement of cued fear recall, providing proof of concept that
augmentation of 5-HT via acute SSRI treatment recruits CRF^BNST neurons
to enhance fear-related behaviour (Fig. 3j, k). Using connectivity-based
chemogenetic approaches, we then tested whether inhibition of BNST
outputs to the VTA and LH is a critical component of 5-HT→BNST-induced
aversive states. We observed that activation of Gq signalling in
VTA-projecting and LH-projecting BNST neurons, targeted by HSV-Cre-eYFP
infused in the VTA and LH and Cre-dependent Gq-coupled
DREADD infused in the BNST (HSV^Cre::hM3Dq^BNST), significantly
attenuated fluoxetine enhancement of cued fear recall (Fig. 3l-o).
Together, these data provide compelling evidence that acute fluoxetine
engenders aversive behaviour by recruiting CRF neurons in the BNST
that in turn inhibit putative GABAergic (anxiolytic and stress-buffering)
outputs from the BNST to the VTA and LH. Pharmacological interventions
that target this circuit may improve adverse symptoms during the
initial weeks of SSRI treatment. Based on the critical role for CRF^BNST
neurons in fluoxetine-induced aversive behaviour, we examined the
effect of a systemic CRF1R antagonist on SSRI enhancement of cued
fear recall. Blocking the CRF system reduced this aversive state and
abolished the increase in sIPSCs in LH-projecting neurons in the BNST
during bath application of 5-HT (Extended Data Fig. 9). This provides
preclinical evidence that CRF1R antagonists given in concert with
SSRIs could be a promising treatment for anxiety disorders.

Together, these data reveal a discrete 5-HT responsive circuit in the 
BNST that underlies pathological anxiety and fear associated with a 
hyperserotonergic state (Extended Data Fig. 10). SSRIs are currently a 
first-line treatment for anxiety and panic disorders, but can acutely exac- 
erbate symptoms, resulting in poor therapeutic compliance. Our results 
strongly implicate 5-HT engagement of a local BNST-inhibitory micro- 
circuit in acute SSRI-induced aversive behaviours in rodents, and could 
potentially be involved in the early adverse events seen in clinical pop- 
ulations, emphasizing the need to identify compounds that selectively 
target both genetically defined and pathway-specific cell populations. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS


Data reporting. Based on power analyses that assumed a normal distribution, a 
20% change in mean and 15% variation, we determined that at least 9 mice per 
group would be needed for behavioral experiments. This was adhered to as far 
as possible, except in cases where mice had to be removed owing to misplaced 
injections or lost headcaps. Mice were randomly assigned to groups and attempts 
were made to balance groups according to variables such as age and housing 
condition. The investigators were not blinded to allocation during experiments, but 
were blinded to outcome assessment for all behavioral experiments. 
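
The sample-size reasoning above can be illustrated with a normal-approximation formula for a two-group comparison. This is only a sketch: the text does not state the significance level or power that were used, so the two-sided alpha of 0.05 and power of 0.80 below are assumptions, and the function name is hypothetical. Under those assumed settings, a 20% change in mean with 15% variation gives 9 mice per group.

```python
import math
from statistics import NormalDist


def mice_per_group(effect_pct, sd_pct, alpha=0.05, power=0.80):
    """Normal-approximation sample size per group (two-sided, two groups).

    alpha and power are assumed values; the Methods do not report them.
    """
    d = effect_pct / sd_pct                          # standardized effect size
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)    # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)             # quantile for target power
    return math.ceil(2 * ((z_alpha + z_beta) / d) ** 2)


print(mice_per_group(effect_pct=20, sd_pct=15))
```

A t-distribution correction would raise the estimate slightly for samples this small, so the result is best read as a lower bound.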

Mice. Mice were used in all experiments. For experiments involving Cre lines,
mice were crossed for several generations to C57 mice before use. All wild-type
mice were C57BL/6 mice obtained from The Jackson Laboratory (Bar Harbor,
ME). For all behavioural experiments except those involving Htr2c^Cre mice, male
mice ranging in age from 8-16 weeks were used. Female Htr2c^Cre mice were used
in chemogenetic manipulations. Both male and female mice aged 6-20 weeks were
used for slice electrophysiology and anatomical tracing experiments. All
behavioural studies or tissue collection for ex vivo slice electrophysiology
were performed during the light cycle.

All behavioural experiments in Htr2c^Cre mice were conducted at the University
of Aberdeen and in accordance with the United Kingdom Animals (Scientific
Procedures) Act of 1986. All in vivo electrophysiology experiments were conducted
in accordance with all rules and regulations at the National Institute on Alcohol
Abuse and Alcoholism at the National Institutes of Health. All other procedures
were conducted in accordance with the National Institutes of Health guidelines for 
animal research and with the approval of the Institutional Animal Care and Use 
Committee at the University of North Carolina at Chapel Hill. 

All animals were group housed on a 12 h light cycle (lights on at 7 a.m.) with
ad libitum access to rodent chow and water, unless described otherwise.
CRF-ires-Cre (Crf^Cre) mice were provided by Bradford Lowell (Harvard University)
and were previously described”!. C57BL/6J mice were obtained from the Jackson
Laboratory (Bar Harbor, ME). To visualize CRF-expressing neurons, Crf^Cre mice
were crossed with either an Ai9 or a Cre-inducible L10-GFP reporter line (Jackson
Laboratory)” to produce CRF-Ai9 or CRF-L10GFP progeny, referred to throughout
the manuscript as CRF-reporters. Sert^Cre mice (from GENSAT) were a generous
gift from Bryan Roth. Htr2c^Cre mice were supplied by Lora Heisler and are
described in detail elsewhere’.

Male mice were used for in vivo optogenetic behavioural experiments and for
assessing the involvement of BNST CRF neurons on fluoxetine-induced
enhancement of fear. Female 5-HT2C-Cre mice were used in chemogenetic
manipulations. Both male and female mice were used for slice electrophysiology
and anatomical tracing experiments. All behavioural studies or tissue collection
for ex vivo slice electrophysiology were performed during the light cycle.

Viruses and tracers. All AAV viruses except INTRSECT constructs were produced
by the Gene Therapy Center Vector Core at the University of North Carolina at
Chapel Hill and had titres of >10^12 genome copies per ml. For ex vivo and in vivo
optical experiments, mice were injected with rAAV5-ef1a-DIO-hChR2(H134R)-eYFP
or rAAV5-ef1a-DIO-eYFP as a control. Red IX retrobeads (Lumafluor)
were used to fluorescently label LH- and VTA-projecting BNST neurons during
ex vivo slice electrophysiology recordings. The retrograde tracer Fluoro-Gold
(Fluorochrome) was used for anatomical mapping. Cholera toxin B (CTB) 555 and
CTB 647 retrograde tracers (Invitrogen; C34776 and C34778, respectively) diluted
to 0.5% (w/v) in sterile PBS were used per injection site for anatomical mapping of
collateral projections from BNST to LH and VTA. For chemogenetic manipulations,
mice were injected bilaterally with 400 nl of rAAV8-hsyn-DIO-hM3D(Gq)-mCherry,
rAAV8-hsyn-DIO-hM4D(Gi)-mCherry, or rAAV8-hsyn-DIO-mCherry.
HSV-hEF1a-mCherry, HSV-ef1a-LSL1-mCherry-IRES-flpo, and HSV-ef1a-IRES-Cre
(supplied by Rachel Neve at the McGovern Institute for Brain Research
at MIT) were injected bilaterally into the VTA and LH at a volume of 500 nl per
site. The INTRSECT construct AAVdj-hSyn-Con/Foff-hChR2(H134R)-EYFP was
infused at 500 nl per side into the BNST. All AAV constructs had viral titres
>10^12 genome particles per ml.

Stereotaxic injections. All surgeries were conducted using aseptic technique.
Adult mice (2-5 months) were deeply anaesthetized with 5% isoflurane (v/v) in
oxygen and placed into a stereotactic frame (Kopf Instruments) while on a heated
pad. Sedation was maintained at 1.5-2.5% isoflurane during surgery. An incision
was made down the midline of the scalp and a craniotomy was performed above
the target regions. Viruses and fluorescent tracers were microinjected using
a Neuros Hamilton syringe at a rate of 100 nl per min. After infusion, the needle
was left in place for 10 min to allow for diffusion of the virus before the needle
was slowly withdrawn. Injection coordinates (in mm, midline, Bregma, dorsal
surface): BNST (±1.00, 0.30, −4.35), LH (±0.9 to 1.10, −1.7, −5.00 to −5.2),
VTA (−0.3, −2.9, −4.6), DR (0.0, −4.65, −3.2 with a 23° angle of approach).
When using retrobeads, injection volumes into the LH and VTA were 300 nl and
400 nl, respectively. Fluoro-Gold injection volumes were 200 nl per target site. CTB
volumes were 200 nl per target site. An optical fibre was implanted in the BNST
(±1.00, 0.20, −4.15) at a 10° angle for in vivo photostimulation studies. After fibre
implantation, dental cement was used to adhere the ferrule to the skull. Following
surgery, all mice returned to group housing. Mice were allowed to recover for at
least 3 weeks before being used for chemogenetic behavioural studies, or 6 weeks
for in vivo optogenetic studies.

Drugs. RS-102221, 5-HT and mCPP were from Tocris (Bristol, UK). For
electrophysiology experiments, RS-102221 was made up to 100 mM in DMSO and then
diluted to a final concentration of 5 μM in aCSF. 5-HT and mCPP were stocked at
10 and 20 mM, respectively, in ddH2O and diluted to their final concentrations in
aCSF. For electrophysiology experiments, clozapine-N-oxide (CNO; from Bryan
Roth) was stocked at 100 mM in DMSO and diluted to 10 μM in aCSF. For
behaviour experiments, CNO was dissolved in 0.5% DMSO (in 0.9% saline) to a
concentration of 0.1 mg per ml or 0.3 mg per ml and injected at 10 ml per kg for a
final dose of 1 or 3 mg per kg, i.p. Fluoxetine (Sigma) was made up in 0.9%
NaCl to a concentration of 1 mg per ml and then injected at 10 ml per kg for a final
dose of 10 mg per kg, i.p.
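
The dosing arithmetic above (solution concentration multiplied by injection volume per body weight) can be checked with a short sketch. The function names and the 25 g example mouse are illustrative assumptions, not part of the original protocol.

```python
def dose_mg_per_kg(conc_mg_per_ml, volume_ml_per_kg=10.0):
    """Delivered dose when a solution is injected at a fixed volume per kg."""
    return conc_mg_per_ml * volume_ml_per_kg


def injection_volume_ml(body_weight_g, volume_ml_per_kg=10.0):
    """Volume to draw up for one animal of the given body weight."""
    return body_weight_g / 1000.0 * volume_ml_per_kg


# CNO solutions of 0.1 and 0.3 mg per ml, fluoxetine at 1 mg per ml
for name, conc in [("CNO low", 0.1), ("CNO high", 0.3), ("fluoxetine", 1.0)]:
    print(name, dose_mg_per_kg(conc), "mg per kg")

# Hypothetical 25 g mouse
print(injection_volume_ml(25.0), "ml")
```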

In vivo electrophysiological procedures. Surgical procedures. Mice were anaes- 
thetized with 2% isoflurane (Baxter Healthcare, Deerfield, IL) and implanted 
with 2 x 8 electrode (35 μm tungsten) micro-arrays (Innovative Neurophysiology,
Durham, NC) targeted at the BNST (ML: 0.8mm, AP: + 0.5mm, and DV: 
—4.15mm relative to Bregma). Following surgery, mice were singly housed and 
allowed at least one week to recover before behavioural testing. 

Fear conditioning. Fear conditioning took place in 27 x 27 x 11cm conditioning 
chambers (Med Associates, St. Albans, VT), with a metal-rod floor (context A) 
and scented with 1% vanilla. Mice received 5 pairings of a pure tone CS with a
0.6mA foot shock. 24h following conditioning, mice underwent a CS recall test 
(10 presentations of the CS alone, 5s ITI), which was conducted in a Plexiglas 
cylinder (20cm diameter) and scented with 1% acetic acid (context B). Stimulus 
presentations for both tests were controlled by MedPC (Med Associates, St. Albans, 
VT). Cameras were mounted overhead for recording freezing behaviour, which 
was scored automatically using CinePlex Behavioural Research System software 
(Plexon, Dallas, TX). 

Electrophysiological recording and single unit analysis. Electrophysiological 
recording took place during both fear conditioning and CS recall tests. Individual 
units were identified and recorded using Omniplex Neural Data Acquisition 
System (Plexon, Dallas, TX). Neural data was sorted using Offline Sorter (Plexon, 
Dallas, TX). Waveforms were isolated manually, using principal component anal- 
ysis. To be included in the analyses, spikes had to exhibit a refractory period 
of at least 1 ms. Autocorrelograms from simultaneously recorded units were 
examined to ensure that no cell was counted twice. Single units were analysed 
by generating perievent histograms (3 s bins) of firing rates from 30s before CS 
onset until 30s after CS offset (NeuroExplorer 5.0, Nex Technologies, Madison, 
AL). Firing rates were normalized to baseline (30s before CS onset) using 
z-score transformation. Analysis included a total of 139 cells over three days 
of recording. Data reported for raw firing rates include only putative principal 
neurons (<10 Hz). 
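
The baseline normalization described above can be sketched as follows. With 3 s bins, the 30 s pre-CS baseline corresponds to the first 10 bins of each perievent histogram. The function name is hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was applied.

```python
from statistics import mean, stdev


def zscore_to_baseline(rates_hz, n_baseline_bins=10):
    """Z-score binned firing rates against the pre-CS baseline bins.

    rates_hz: firing rate per 3 s bin, starting 30 s before CS onset.
    n_baseline_bins: 10 bins of 3 s cover the 30 s baseline window.
    """
    baseline = rates_hz[:n_baseline_bins]
    mu = mean(baseline)
    sd = stdev(baseline)  # sample SD; an assumption, not stated in the text
    if sd == 0:
        raise ValueError("baseline firing rate has zero variance")
    return [(r - mu) / sd for r in rates_hz]


# Toy unit: baseline alternates 2 and 4 Hz, then fires at 9 Hz after CS onset
rates = [2.0, 4.0] * 5 + [9.0]
z = zscore_to_baseline(rates)
print(round(z[-1], 2))
```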

The formula for computing the suppression ratio was (average freezing rate) /
(average freezing rate + average movement rate). The ratio was calculated for
each cell individually. A value of 0.5 indicates no change in rate.
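
As a worked illustration of the ratio defined above, the sketch below takes a unit's average firing rate during freezing and during movement. The function and argument names are hypothetical, and reading "rate" as firing rate in Hz is an interpretation of the text.

```python
def suppression_ratio(rate_freezing_hz, rate_movement_hz):
    """Ratio of freezing-epoch rate to the sum of freezing and movement rates.

    0.5 means the unit fires equally in both states; values below 0.5 mean
    lower firing during freezing, values above 0.5 mean higher firing.
    """
    total = rate_freezing_hz + rate_movement_hz
    if total == 0:
        raise ValueError("unit did not fire in either state")
    return rate_freezing_hz / total


print(suppression_ratio(4.0, 4.0))  # equal rates
print(suppression_ratio(1.0, 3.0))  # suppressed during freezing
```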

Ex vivo slice electrophysiology. Brains were sectioned at 0.07 mm per s on a
Leica 1200S vibratome to obtain 300 μm coronal slices of the BNST, which were
incubated in a heated holding chamber containing normal, oxygenated aCSF (in
mM: 124 NaCl, 4.4 KCl, 2 CaCl2, 1.2 MgSO4, 1 NaH2PO4, 10.0 glucose, and 26.0
NaHCO3) maintained at 30 ± 1 °C for at least 1 h before recording. Slices were
transferred to a recording chamber (Warner Instruments) submerged in normal,
oxygenated aCSF maintained at 28-30 °C at a flow rate of 2 ml per min. Neurons of
the BNST were visualized using infrared differential interference contrast (DIC)
video-enhanced microscopy (Olympus). Borosilicate electrodes were pulled with a
Flaming-Brown micropipette puller (Sutter Instruments) and had a pipette
resistance between 3-6 MΩ. Signals were acquired via a Multiclamp 700B amplifier,
digitized at 10 kHz and analysed with Clampfit 10.3 software (Molecular Devices,
Sunnyvale, CA, USA).

Light-evoked action potentials. In Sert^Cre or Crf^Cre mice, fluorescently labelled
neurons expressing ChR2 were visualized and stimulated with a blue (470 nm) LED
using a 1 Hz, 2 Hz, 5 Hz, 10 Hz, and 20 Hz stimulation protocol with a pulse width
of 0.5 ms. Evoked action potentials were recorded in current clamp mode using a
potassium gluconate-based internal solution (in mM: 135 K+-gluconate, 5 NaCl,
2 MgCl2, 10 HEPES, 0.6 EGTA, 4 Na2ATP, 0.4 Na2GTP, pH 7.3, 285-290 mOsmol).

Light-evoked synaptic transmission. In Crf^Cre mice with ChR2 in the BNST and
retrograde tracer beads in the VTA or LH, we visualized non-ChR2-expressing,
beaded neurons using a green (532 nm) LED. Recordings were conducted in voltage
clamp mode using a caesium methanesulfonate (Cs-Meth)-based internal solution
(in mM: 135 caesium methanesulfonate, 10 KCl, 1 MgCl2, 0.2 EGTA, 2 QX-314,
4 MgATP, 0.3 GTP, 20 phosphocreatine, pH 7.3, 285-290 mOsmol) so that we
could detect EPSCs (−55 mV) and IPSCs (+10 mV) in the same neuron. After
confirming the absence of a light-evoked EPSC signal, we measured light-evoked
IPSCs during a single 5-ms light pulse of 470 nm. In a subset of these experiments,
SR95531 (GABAzine, 10 μM) was bath applied for 10 min to block IPSCs.

Drug effects in CRF^BNST neurons. Crf-reporter mice were injected with
retrograde tracer beads into the VTA (ML −0.5, AP −2.9, DV −4.6). We then recorded
from beaded (VTA-projecting) and non-beaded (non-projecting) CRF neurons
in the BNST. Acute drug effects were determined in current clamp mode in
the presence of TTX using a potassium gluconate-based internal solution. After
a 5-min stable baseline was established, 5-HT (10 μM) or mCPP (20 μM) was
bath applied for 10 min while recording changes in membrane potential. The
difference in membrane potential between baseline and drug application at peak
effect (delta or ΔMP) was later determined. In a subset of mCPP experiments,
slices were incubated with RS-102221 (5 μM) for at least 20 min before experiments
began.

Synaptic transmission. Spontaneous inhibitory postsynaptic currents (sIPSCs)
were assessed in voltage clamp using a potassium chloride/gluconate-based
intracellular solution (in mM: 70 KCl, 65 K+-gluconate, 5 NaCl, 10 HEPES, 0.5 EGTA,
4 ATP, 0.4 GTP, pH 7.2, 285-290 mOsmol). IPSCs were pharmacologically isolated
by adding kynurenic acid (3 mM) to the aCSF to block AMPA and NMDA
receptor-dependent postsynaptic currents. The amplitude and frequency of sIPSCs
were determined from 2 min recording episodes at −70 mV. The baseline was averaged
from the 4 min preceding the application of 5-HT (10 μM) or mCPP (10 μM) for
10 min. In a subset of these experiments, RS-102221 (5 μM) was added to the aCSF
and slices were incubated in this drug solution for at least 20 min before
experiments began. For miniature IPSCs (mIPSCs), TTX was included in the aCSF to
block network activity.

In Sert^Cre::ChR2^BNST mice with retrograde tracer beads in the VTA, sIPSCs
were recorded as described above. After achieving a stable baseline, a 10 s, 20 Hz
photostimulation was applied.

For assessment of spontaneous excitatory postsynaptic currents (sEPSCs), a
caesium gluconate-based intracellular solution was used (in mM: 135 Cs+-gluconate,
5 NaCl, 10 HEPES, 0.6 EGTA, 4 ATP, 0.4 GTP, pH 7.2, 290-295 mOsmol).
AMPAR-mediated EPSCs were pharmacologically isolated by adding 25 μM picrotoxin to
the aCSF. sEPSC recordings were acquired in 2 min recording blocks at −70 mV.

Fast-scan cyclic voltammetry (FSCV). Electrodes were fabricated as previously
described and cut to 50-100 μm in length”. Animal and slice preparation were as
described above for electrophysiology and slices were perfused on the rig in aCSF.
Using a custom-built potentiostat (University of Washington Seattle), 5-HT
recordings were made in the BNST using TarHeel CV written in LabVIEW (National
Instruments). Briefly, a triangular waveform (−0.1 V to 1.3 V with a 10% phase shift
at 1,000 V per s, versus Ag/AgCl) was applied to the carbon fibre electrode at a rate
of 10 Hz. Slices were optically stimulated with 20 5-ms blue (490 nm) light pulses at
a rate of 20 Hz down the submerged 40× objective. 10 cyclic voltammograms were
averaged before optical stimulation for background subtraction. Voltammograms
were digitally smoothed one time with a fast Fourier transform following data
collection and analysed with HDCV (UNC Chapel Hill). Fluoxetine (10 μM) was
bath applied following a stable baseline (20 min).
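
The background-subtraction step described above can be sketched in a few lines. This is an illustration of the general idea only, not the TarHeel CV or HDCV implementation: the function name, the list-of-lists data layout and the toy currents are assumptions. At a 10 Hz scan rate, the 10 averaged voltammograms span the 1 s immediately before stimulation.

```python
def background_subtract(voltammograms, stim_index, n_background=10):
    """Subtract the mean of the scans just before stimulation from every scan.

    voltammograms: list of scans, each a list of currents at fixed potentials.
    stim_index: index of the first scan recorded after optical stimulation.
    """
    if stim_index < n_background:
        raise ValueError("not enough pre-stimulation scans for the background")
    pre = voltammograms[stim_index - n_background:stim_index]
    n_points = len(pre[0])
    # Point-by-point average of the pre-stimulation scans
    background = [sum(cv[i] for cv in pre) / n_background for i in range(n_points)]
    return [[cv[i] - background[i] for i in range(n_points)] for cv in voltammograms]


# Toy data: ten identical baseline scans, then one scan with extra current
scans = [[1.0, 2.0] for _ in range(10)] + [[1.5, 2.5]]
print(background_subtract(scans, stim_index=10)[10])
```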

Behavioural assays. For chemogenetic manipulations, mice were transported to 
a holding cabinet adjacent to the behavioural testing room to habituate for at least 
30 min before being pretreated with CNO (3 mg per kg, i.p. for Crf^Cre mice and 1 mg
per kg, i.p. for Htr2c^Cre mice). All behavioural testing began 45 min following CNO
treatment, with the exception of fear conditioning training, which occurred 30 min
after a CNO injection. When assessing the effect of fluoxetine on fear conditioning,
fluoxetine (10 mg per kg, i.p.), or vehicle, was administered 1 h before training
(30 min before CNO treatment). For optogenetic manipulations, mice received 
bilateral stimulation (473 nm, ~10mW, 5 ms pulses, 20 Hz) when specified. Unless 
specified, all equipment was cleaned with a damp cloth between mouse trials. All 
sessions were video recorded and analysed using EthoVision software (Noldus 
Information Technologies) except where noted. 

Elevated plus maze. Mice were placed in the centre of an elevated plus maze and 
allowed to explore during a 5 min session. Light levels in the open arms were ~14 
lux. During optogenetic manipulations mice received bilateral stimulation during 
the entire 5 min session. Mice that left the maze were excluded from analysis (n =2 
control, 1 ChR2 from optogenetic experiments). 

Open field. Mice were placed into the corner of a white Plexiglas open field arena 
(25 x 25 x 25cm) and allowed to freely explore for 30 min. The centre of the open 
field was defined as the central 25% of the arena. For optogenetic studies the 30 min 
session was divided into three 10-min epochs consisting of stimulation off, stimu- 
lation on, and stimulation off periods. 
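
The centre-zone definition above can be made concrete with a small sketch. It assumes that "central 25%" refers to area and that the zone is a square concentric with the 25 x 25 cm arena floor; neither detail is stated in the text, and the function names are hypothetical.

```python
import math


def centre_zone_bounds(arena_cm=25.0, area_fraction=0.25):
    """Lower and upper coordinate (cm) of a concentric square centre zone."""
    side = arena_cm * math.sqrt(area_fraction)  # 12.5 cm for 25% of the area
    margin = (arena_cm - side) / 2
    return margin, arena_cm - margin


def in_centre(x_cm, y_cm, arena_cm=25.0, area_fraction=0.25):
    """True if a tracked position falls inside the centre zone."""
    lo, hi = centre_zone_bounds(arena_cm, area_fraction)
    return lo <= x_cm <= hi and lo <= y_cm <= hi


print(centre_zone_bounds())
print(in_centre(12.5, 12.5), in_centre(1.0, 1.0))
```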

Novelty-induced suppression of feeding. 48 h before testing, mice were provided
with access to a single piece of Froot Loops cereal (Kellogg’s) in their home cage.
24 h before testing, home cage chow was removed and mouse body weights were
recorded. Water remained available ad libitum. Beginning at least one hour before
testing, mice were transferred to new clean cages so they were singly housed for the
test session and body weights were recorded. During the test session mice were placed
into an arena (25 x 25 x 25 cm) that contained a single Froot Loop on top of a piece
of circular filter paper. Mice were monitored by a live observer and the latency
for the mouse to begin eating the pellet was measured, allowing up to 10 min.
All mice began eating within this time. Following the initiation of feeding, mice
were removed from the arena and placed back into their home cages. Mice were
then provided with 10 min of access to a pre-weighed amount of Froot Loops for
a post-test feeding session. After this 10 min post-test, the remaining Froot Loops
were weighed and mice were returned to ad libitum home cage chow. Mice were
returned to group housing at the end of this session. For optogenetic experiments,
mice received constant 20 Hz optical stimulation during both the latency to feed
assay and the 10 min post-test. During optogenetic experiments, one control mouse
did not feed during the 10 min NSF session and was excluded from the results.

Home cage feeding. Sert^Cre mice were food deprived for 24 h. On the day of the
experiment, mice were acclimated to the behaviour room for 1 h. A single
pre-weighed food pellet was placed in the home cage and the mice were allowed to eat
for 10 min during optogenetic stimulation. At the end of the experimental session,
the pellet was removed and weighed and mice were given ad libitum access to food.
Htr2c^Cre mice were acclimated in metabolic chambers (TSE Systems, Germany)
for 2 days before the start of the recordings. After acclimation, mice were food
deprived for 24 h. Following fasting, mice received an i.p. injection of CNO 30 min
before food was presented again. Mice were recorded for 12 h with the following
measurements being taken every 30 min: water intake, food intake, ambulatory activity
(in x and z axes), and gas exchange (O2 and CO2) (using the TSE LabMaster system,
Germany). Energy expenditure was calculated according to the manufacturer's
guidelines (PhenoMaster Software, TSE Systems).

Fear conditioning. We used a three-day protocol to assess both cued and
contextual fear recall. On the first day, mice were placed into a fear conditioning
chamber (Med Associates) that contained a grid floor and was cleaned with a scented
paper towel (19.5% ethanol, 79.5% H2O, 1% vanilla). After a 3 min baseline period,
mice were exposed to a 30 s tone (3 kHz, 80 dB) that co-terminated with a 2 s scrambled
foot shock (0.6 mA). A total of 5 tone-shock pairings were delivered with a random
inter-tone interval (ITI) of 60-120 s. For optogenetic studies, light stimulation
occurred only during the 30-s tones of this session. Following delivery of the last
foot shock, mice remained in the conditioning chamber for a 2-min consolidation
period. 24 h later, mice were placed into a separate conditioning box (Med
Associates) that contained a white Plexiglas floor, a striped pattern on the walls,
and was cleaned and scented with a 70% ethanol solution. After a 3 min baseline
period, mice were presented with 10 tones (30 s, 3 kHz, 80 dB) with a 5 s ITI. Mice
remained in the chamber after the last tone for a two-minute consolidation period.
24 h later (48 h after training), mice were returned to the original training
chamber for 5 min. For each session, freezing behaviour was hand-scored every 5 s by
a trained observer blinded to experimental treatment as described previously”*.
Freezing was defined as a lack of movement except as required for respiration.
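
Time-sampled scoring of this kind is usually summarized as the percentage of observations scored as freezing. The sketch below assumes one binary score per 5 s observation, so a 30 s tone yields 6 scores; the function name and the example scores are illustrative and do not come from the study.

```python
def percent_freezing(scores):
    """Percentage of time-sampled observations scored as freezing.

    scores: one entry per 5 s observation, truthy if the mouse was freezing.
    """
    if not scores:
        raise ValueError("no observations to score")
    return 100.0 * sum(1 for s in scores if s) / len(scores)


# Hypothetical scores for one 30 s tone (6 observations at 5 s intervals)
tone_scores = [1, 1, 0, 1, 0, 0]
print(percent_freezing(tone_scores))
```
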

Immunohistochemistry and histology. All mice used for behavioural and
anatomical tracing experiments were anesthetized with Avertin and transcardially
perfused with 30 ml of ice-cold 0.01 M PBS followed by 30 ml of ice-cold 4%
paraformaldehyde (PFA) in PBS. Brains were extracted and stored in 4% PFA for 24 h
at 4 °C before being rinsed twice with PBS and stored in 30% sucrose and PBS until
the brains sank. 45 μm slices were obtained on a Leica VT100S and stored in 50/50
PBS/Glycerol at −20 °C. DREADD or ChR2-containing sections were mounted
on slides, allowed to dry, coverslipped with VectaShield (Vector Labs, Burlingame,
CA), and stored in the dark at 4 °C.

Tryptophan hydroxylase/Fluoro-Gold/c-fos triple labelling. We stained
free-floating dorsal raphe sections using indirect immunofluorescence sequentially
for first tryptophan hydroxylase (TPH) and Fluoro-Gold (FG) and then c-fos. For
TPH/FG, we washed sections 3× for 5 min with 0.01 M PBS, permeabilized them
for 30 min in 0.5% Triton/0.01 M PBS, and washed the sections again 2× with
0.01 M PBS. We blocked the sections for 1 h in 0.1% Triton/0.01 M PBS containing
10% (v/v) normal donkey serum and 1% (w/v) bovine serum albumin (BSA). We
then added primary antibodies (1:500 mouse anti-TPH (Sigma Aldrich T0678)
and 1:3,000 guinea-pig anti-Fluoro-Gold (Protos Biotech NM101)) to blocking
buffer and incubated the sections overnight at 4 °C. The next day, we washed the
sections 3× for 5 min with 0.01 M PBS, then incubated them with 1:500 Alexa
Fluor 647-conjugated donkey anti-mouse and Alexa Fluor 488-conjugated donkey
anti-guinea pig secondary antibodies for 2 h at room temperature, and washed the
sections 4× for 5 min with 0.01 M PBS. We then proceeded directly to the c-fos
tyramide signal amplification-based immunofluorescent staining. We permeabilized
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the sections in 50% methanol for 30 min, then quenched endogenous peroxidase 
activity in 3% hydrogen peroxide for 5 min. Followed by two 10 min washes in 
0.01 M PBS, we blocked the sections in PBS containing 0.3% Triton X-100 and 
1.0% BSA for Lh. c-fos primary antibody (Santa Cruz Biotechnology, -sc-52) was 
added to sections at 1:3,000 and sections were incubated for 48h at 4°C. On day 3, 
we washed the sections in TNT buffer (0.1 M Tris-HCl pH 7.5, 0.15 M NaCl, 
0.05% Tween-20) for 10 min and blocked them in TNB buffer (0.1 M Tris-HCl pH 7.5, 
0.15 M NaCl, 0.5% blocking reagent, PerkinElmer FP1020) for 30 min. We then 
incubated the sections in secondary antibody (goat anti-rabbit HRP-conjugated, 
PerkinElmer) 1:200 in TNB buffer for 30 min, washed the sections in TNT buffer 
4× for 5 min, and then incubated the sections in Cy3 dye diluted in TSA amplification 
diluents for 10 min. We washed the sections 2× in TNT buffer, mounted 
them on microscope slides, and coverslipped the slides using Vectashield mounting 
medium. We acquired 4-5 tiled (2 × 4) z-stack images (5 optical slices comprising 
7 μm total) of the dorsal raphe from each naive and shock mouse on a Zeiss 800 
upright confocal microscope. Scanning parameters and laser power were matched 
between groups. Images were preprocessed using stitching and maximum inten- 
sity projection and then analysed using an advanced processing module in Zeiss 
Zen Blue that allows nested analysis of multiple segmented fluorescent channels 
within parent classes. Double-labelled and triple-labelled cells were validated in 
a semi-automated fashion. At least 4 sections per mouse were counted in this 
way. One mouse was identified as a significant outlier in the shock group and was 
excluded from further analysis. 

SertCre::ChR2 and Crf::Intrsect-ChR2 validation. To verify expression of ChR2-expressing 
fibres in the BNST originating from DRN serotonergic neurons, 
300 μm slices used for ex vivo electrophysiological recordings containing the 
DRN and BNST were stored in 4% paraformaldehyde at 4°C for 24h before being 
rinsed with PBS, mounted, and coverslipped with Vectashield mounting medium. 
Images showing eYFP fluorescence from the DRN and BNST were obtained on 
a Zeiss 800 upright confocal microscope using a 10× objective and tiled z stacks. 
To validate the INTRSECT construct, mice received injections of HSV-hEF1α-mCherry 
or HSV-ef1α-LSL1-mCherry-IRES-flpo to both the LH and VTA bilaterally 
(n=4 and 5, respectively). Both groups received AAVDJ-hSyn-Cre-on/ 
Flp-off-hChR2(H134R)-EYFP to the BNST bilaterally. Six weeks following injection, 
mice were perfused and tissue was collected as described above. To visualize YFP 
expression in the BNST of Crf::IntrsectBNST mice, free-floating slices containing 
the BNST were rinsed three times with PBS for 5 min each. Slices were then 
incubated in 50% methanol for 30 min then incubated in 3% hydrogen peroxide 
for 5 min. Following three 10-min washes in PBS, slices were incubated in 0.5% 
Triton X-100 for 30 min followed by a 10 min PBS wash. Slices were blocked in 10% 
normal donkey serum/0.1% Triton X-100 for 1h, and then they were incubated 
overnight at 4°C with a primary chicken anti-GFP antibody (GFP-1020, Aves) 
at 1:500 in blocking solution. Following primary incubation, slices were rinsed 
three times with 0.01 M PBS for 10 min each and incubated with a fluorescent 
secondary antibody (AlexaFluor 488 donkey anti-chicken) at 1:200 in PBS for 2h 
at room temperature. Slices were then rinsed with four 10-min PBS washes before 
being mounted onto glass slides and coverslipped with Vectashield with DAPI. 
A 3 × 4 tiled z stack (7 optical sections comprising 35 μm total) image from both 
the left and right hemispheres of the BNST was obtained at 20× magnification 
using a Zeiss 800 upright confocal microscope. Scanning parameters and laser 
power were matched between groups. Images were preprocessed using stitching 
and maximum-intensity projection. The number of fluorescent cells in the dorsal 
and ventral aspects of the BNST was counted by a blinded scorer using the cell 
counter plug-in in FIJI (ImageJ). Each hemisphere was considered independently 
per mouse. One mouse in the flp-expressing group was a significant outlier for 
number of cells expressed in a ventral BNST hemisphere (ROUT, Q=0.1%) and 
all data from that mouse were excluded. 

Cholera toxin retrograde tracer studies in CRF reporter mice. Three male CRF-L10a 
reporter mice were injected with 200 nl of CTB 555 and CTB 647 bilaterally to the 
LH and VTA, respectively, as described above. Five days following injection, mice were 
perfused as described above, and the brains were extracted and stored in 4% paraformaldehyde 
for 24h at 4°C before being rinsed with PBS and transferred to 30% 
sucrose until the brains sank. 45 μm sections containing the BNST were collected 
as described above. Sections containing the BNST were mounted on glass slides 
and coverslipped using Vectashield. An image from the left and right hemispheres 
of a medial section of the BNST was obtained on a Zeiss 800 upright microscope 
using a 20× objective and 3 × 5 tiled z stacks (5 optical slices comprising 7 μm total). 
Images were preprocessed using stitching and maximum intensity projection, and 
were then analysed using the cell counter function in FIJI (ImageJ). Only cells positive 
for GFP (putative CRF neurons) were considered. Cells were scored exclusively 
as either 555+ only (LH-projecting), 647+ only (VTA-projecting), 555+ and 647+ 
(projecting to both LH and VTA), or 555− and 647− (unlabelled; neither LH- nor 
VTA-projecting). The total number of CRF neurons scored was calculated as the 
sum of all four groups, and percentages of each type were calculated from this 
value. Each hemisphere was scored and plotted independently (n= 6 images from 
3 mice), and the dorsal and ventral BNST were considered separately. The average 
values were plotted as pie charts (Extended Data Fig. 5). 
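
The scoring scheme described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic (four mutually exclusive classes, percentages taken over their sum); the function names, category labels and example counts are hypothetical and are not taken from the study, in which counting was done by hand in FIJI.

```python
from collections import Counter

# Hypothetical labels for the four mutually exclusive classes described in the text.
CATEGORIES = ("LH_only", "VTA_only", "both", "unlabelled")


def classify_cell(ctb555_positive: bool, ctb647_positive: bool) -> str:
    """Assign one GFP-positive (putative CRF) cell to exactly one class."""
    if ctb555_positive and ctb647_positive:
        return "both"        # projects to both LH and VTA
    if ctb555_positive:
        return "LH_only"     # CTB 555 was injected into the LH
    if ctb647_positive:
        return "VTA_only"    # CTB 647 was injected into the VTA
    return "unlabelled"      # neither LH- nor VTA-projecting


def projection_percentages(cells):
    """Return the percentage of cells in each class for one image.

    `cells` is an iterable of (ctb555_positive, ctb647_positive) pairs.
    The denominator is the sum of all four classes, as in the text.
    """
    counts = Counter(classify_cell(a, b) for a, b in cells)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no cells were scored")
    return {name: 100.0 * counts[name] / total for name in CATEGORIES}


# Made-up example image with eight scored cells.
example = [(True, False)] * 2 + [(False, True)] + [(True, True)] + [(False, False)] * 4
percentages = projection_percentages(example)
```

Per-image percentages computed this way could then be averaged across hemispheres before plotting.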

Double fluorescence in situ hybridization (FISH). For validation of the 2C-Cre line 
and comparison of CRF/2C mRNA cellular co-localization, mice were anesthetized 
using isoflurane, rapidly decapitated, and brains rapidly extracted. Immediately 
after removal, the brains were placed on a square of aluminium foil on dry ice 
to freeze. Brains were then placed in a —80°C freezer for no more than 1 week 
before slicing. 12 μm slices were made of the BNST on a Leica CM3050S cryostat 
(Germany) and placed directly on coverslips. FISH was performed using 
the Affymetrix ViewRNA 2-Plex Tissue Assay Kit with custom probes for CRF, 
5-HT2C, and Cre designed by Affymetrix (Santa Clara, CA). Slides were coverslipped 
with SouthernBiotech DAPI Fluoromount-G (Birmingham, AL). 3 × 5 
tiled z stack (15 optical sections comprising 14 μm total) images of the entire 12 μm 
slice were obtained on a Zeiss 780 confocal microscope for assessment of CRF/2C 
colocalization. A single-plane 40× tiled image of a CRF/2C slice was obtained on a 
Zeiss 800 upright confocal microscope for the magnified image shown in Extended 
Data 6b, right. 3 × 5 tiled z stack (7 optical sections comprising 18 μm) images of 
2C/Cre slices were obtained on a Zeiss 800 upright confocal microscope for the 
2C/Cre validation. All images were preprocessed with stitching and maximum 
intensity projection. An image of the BNST from 3 mice in each condition was 
hand counted for each study using the cell counter plugin in FIJI (ImageJ). Cells 
were classified into three groups: probe 1+, probe 2+, or probe 1+ and 2+. Only 
cells positive for a probe were considered. Results are plotted as average classified 
percentages across the three images. 

Group assignment. No specific method of randomization was used to assign 
groups. Animals were assigned to experimental groups so as to minimize the 
influence of other variables such as age or sex on the outcome. 
Inclusion/exclusion criteria. Pre-established criteria for excluding mice from 
behavioural analysis included (1) missed injections, (2) anomalies during behav- 
ioural testing, such as mice falling off the elevated plus maze, (3) damage to or loss 
of optical fibres, and (4) statistical outliers, as determined by Grubbs' test. 
Sample size. A power analysis was used to determine the ideal sample size for 
behaviour experiments. Assuming a normal distribution, a 20% change in mean 
and 15% variation, we determined that we would need 8 mice per group. In some 
cases, mice were excluded due to missed injections or lost optical fibres resulting 
in fewer than 8 mice per group. For electrophysiology experiments, we aimed for 
5-7 cells from 3-4 mice. 
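
As a rough illustration of how a group size follows from these inputs, the sketch below applies the standard normal-approximation formula for comparing two group means. The significance level, power and two-sided test are assumptions (the text states only the 20% change and 15% variation), and the function name is hypothetical; this is not a reconstruction of the authors' actual calculation.

```python
import math
from statistics import NormalDist


def n_per_group(effect: float, sd: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate sample size per group for a two-sided, two-group comparison.

    Uses n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2 with d = effect / sd.
    alpha and power are assumed defaults, not values reported in the text.
    """
    d = effect / sd  # standardized effect size
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)


# 20% change in mean with 15% variation, under the assumed alpha and power.
n = n_per_group(effect=0.20, sd=0.15)
```

Under these assumed settings the approximation gives 9 animals per group, the same order as the 8 reported; a different power, a one-sided test or another approximation would shift the figure.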

Statistics. Data are presented as means ± s.e.m. For comparisons with only two 
groups, P values were calculated using paired or unpaired t-tests as described in 
the figure legends. Comparisons across more than two groups were made using 
a one-way ANOVA, and a two-way ANOVA was used when there was more than 
one independent variable. A Bonferroni post-test was used following significance 
with an ANOVA. In cases in which ANOVA was used, the data met the assump- 
tions of equality of variance and independence of cases. If the condition of equal 
variances was not met, Welch's correction was used. Some of the sample groups 
were too small to detect normality (<8 samples) but parametric tests were used 
because nonparametric tests lack sufficient power to detect differences in small 
samples (GraphPad Statistics Guide, http://www.graphpad.com). The standard 
error of the mean is indicated by error bars for each group of data. Differences 
were considered significant at P values below 0.05. All data were analysed with 
GraphPad Prism software. 
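
For readers who want to see the quantities named above, the following sketch computes the standard error of the mean and the Welch-corrected t statistic with its approximate degrees of freedom. It uses only the Python standard library and made-up numbers; the analyses in the paper were run in GraphPad Prism, so this is an illustration of the formulas and not the authors' code.

```python
import math
from statistics import mean, stdev


def sem(values):
    """Standard error of the mean: sample s.d. divided by sqrt(n)."""
    return stdev(values) / math.sqrt(len(values))


def welch_t(a, b):
    """Welch's unequal-variance t statistic and Welch-Satterthwaite d.f."""
    va = stdev(a) ** 2 / len(a)  # squared standard error of group a
    vb = stdev(b) ** 2 / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up example data for two groups.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat, dof = welch_t(group_a, group_b)
```

Converting the statistic to a P value requires the t distribution, which the standard library does not provide; a statistics package would be needed for that step.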
Extended Data Figure 1 | In vivo recordings in BNST neurons during 
fear conditioning reveal opposite patterns of activation during 
acquisition and recall. a, b, Representative neuronal firing rate (a) 
and population Z score of the firing rate (b) for BNST neurons (n = 45 
cells from 7 mice) 30s before conditioned stimulus (tone), during the 
conditioned stimulus (CS), and 30s after the unconditioned stimulus. 
c, Percentage time spent freezing during fear acquisition, cued fear 
recall and contextual fear recall. d, Electrode placements for BNST 
recordings. e, Raw firing rates during freezing (blue) versus movement 
(red) epochs were averaged across all putative principal neurons (firing 
rate < 10 Hz). Acquisition: cells in BNST exhibited greater average 
firing rates during freezing epochs compared to movement epochs 
during CS3 (t44= 2.88, P< 0.01, Student's unpaired two-tailed t-test), 
CS4 (t44 = 3.14, P< 0.01, Student’s unpaired two-tailed t-test), and CS5 
(t44= 4.4, P< 0.001, Student’s unpaired two-tailed t-test) (n = 45 cells from 
7 mice). CS recall: average firing rates during freezing epochs decreased 
over CS presentations such that firing during block 5 was significantly 
less than block 1 (t41 = 3.44, P= 0.001, Student’s unpaired two-tailed 
t-test). Freezing firing rates during block 5 were also significantly less 
than movement epochs during block 5 (t41 = 4.03, P< 0.001, Student’s 
unpaired two-tailed t-test) (n = 42 cells from 7 mice). CX test: average 
firing rate was significantly greater during movement versus freezing 
epochs during minute 1 (t44= 4.83, P< 0.001, Student’s unpaired two- 
tailed t-test), minute 2 (t44= 3.17, P< 0.01, Student’s unpaired two-tailed 
t-test), and minute 5 (t44 = 4.36, P< 0.001, Student’s unpaired two-tailed 
t-test) (n=45 cells from 7 mice). f, Freezing-related changes in firing 
rates during the CS were determined by measuring the ratio of average 
firing rates during freezing versus movement epochs for each session. 
Acquisition: activity during freezing epochs increased significantly relative 
to movement epochs during CS4 (t45 = 3.26, P< 0.01, Student’s unpaired 
two-tailed t-test) and CS5 (t45 = 2.17, P< 0.05, Student’s unpaired two-tailed 
t-test) (n= 46 cells from 7 mice). CS recall: freezing significantly 
suppressed activity relative to movement epochs during the last two CS 
presentations (t47 = 5.29, P < 0.001, Student’s unpaired two-tailed 
t-test) (n= 48 cells from 7 mice). CX test: freezing significantly suppressed 
activity during minute 1 (t44= 6.06, P < 0.001, Student’s unpaired two-tailed 
t-test), minute 2 (t44= 2.92, P< 0.01, Student’s unpaired two-tailed 
t-test), and minute 5 (t44= 3.55, P = 0.001, Student’s unpaired two-tailed 
t-test) (n=45 cells from 7 mice). g, Plots showing correlation between 
freezing behaviour and firing rate of BNST neurons across sessions and for 
all sessions. Data are mean ± s.e.m. *P < 0.05; **P < 0.01; ***P < 0.001. 
Scale bar, 100 μm. 
Extended Data Figure 2 | Effects of optogenetic stimulation of 
5-HT inputs to the BNST on feeding, anxiety and locomotion. 
a-c, SertCre::ChR2DRN→BNST mice exhibited reduced probability (t15 = 2.67, 
P<0.05, Student’s unpaired two-tailed t-test, n = 8 control, n = 9 ChR2) 
[...] n = 8 control, n = 9 ChR2) to enter the open arms of the EPM without 
exhibiting locomotor deficits. d, e, Photostimulation of 5-HTDRN→BNST 
terminals had no effect on locomotor activity in the open field (d) (n=9 
control, n= 11 ChR2) or home cage feeding (e) (n= 4 control, n=6 
ChR2). Data are mean ± s.e.m. *P< 0.05. 
Extended Data Figure 3 | Chemogenetic activation of 5-HT2CR-expressing 
neurons in the BNST increases anxiety-like behaviour. 
a, Confocal images of coronal BNST slices obtained from Htr2cCre 
mice following double fluorescence in situ hybridization for 5-HT2CR 
and Cre. Yellow arrows indicate cells in which there is co-localization, 
red arrows indicate cells in which only Cre is expressed and green 
arrows indicate cells in which only 5-HT2CR is expressed. b, Pie chart 
representing the distribution of genetic markers in BNST neurons. 
c, Experimental configuration in Htr2cCre::hM3DqBNST mice. d, Coronal 
images showing c-fos induction in 5-HT2CR-expressing neurons in 
the BNST of Htr2cCre::hM3DqBNST or Htr2cCre::mCherryBNST mice 
following CNO injection. e, Bath application of CNO depolarized 
5-HT2CR-expressing neurons expressing hM3Dq in slice (n = 3 cells from 
3 mice). f, Chemogenetic stimulation of 5-HT2CR-expressing neurons 
in BNST increased latency to feed in the NSF (t11 = 2.591, P< 0.05, 
Student’s unpaired two-tailed t-test, n = 6 mCherry, n = 7 hM3Dq). 
g, Chemogenetic activation of 5-HT2CR-expressing BNST neurons had no 
effect on home cage feeding (n =5 mCherry, n= 6 hM3Dq). h, Confocal 
images from Htr2cCre::mCherryBNST mice showing mCherry expression 
in 5-HT2CR-expressing soma in the BNST and fibres in the LH and VTA. 
Data are mean ± s.e.m. *P< 0.05. Scale bar, 100 μm. 
Extended Data Figure 4 | Electrophysiological characterization of 
5-HT responses and 5-HT receptor expression in CRFBNST neurons. 
a, A pie chart showing the distribution of CRFBNST neurons that were 
depolarized, hyperpolarized, or had no response to 5-HT (n=8 cells 
from 4 mice). b, Coronal images of the BNST showing co-localization of 
5-HT2CRs with CRF mRNA using double fluorescence in situ hybridization. 
c, d, Histograms showing the percentage of 5-HT2C neurons that express 
CRF and the percentage of CRF neurons that express 5-HT2CRs in the 
BNST (n=3 slices from 3 mice). e, Recording configuration in CRFBNST 
neurons. f, Slice electrophysiology in BNST of Crf reporter mice showing 
depolarization of all (VTA-projecting and non-projecting) CRF neurons 
following bath application of the 5-HT2 receptor agonist mCPP (n= 12 
cells from 6 mice) and blockade of this response by the 5-HT2C receptor 
antagonist RS-102221 (n=5 cells from 3 mice). g, Change in membrane 
potential induced by mCPP (t12 = 2.18, P< 0.05, one-sample t-test, n = 13 
cells from 6 mice) is blocked by a 5-HT2CR antagonist (n =5 cells from 
3 mice). h, mCPP selectively depolarizes non-VTA-projecting CRFBNST 
neurons (n=5 cells from 2 mice non-VTA-projecting CRF, n=5 cells from 
4 mice VTA-projecting CRF). Data are mean ± s.e.m. *P < 0.05. 
Extended Data Figure 5 | 5-HT activates inhibitory microcircuits in 
the BNST that modulate outputs to the LH. a, Recording configuration 
in CRF reporter mice infused with retrograde tracer beads in the LH. 
b, Average traces of 5-HT induced depolarization in LH projecting 
versus non-projecting neurons. c, Histograms showing 5-HT induced 
depolarization in non-LH projecting BNST neurons (t4= 4.425, P< 0.05, 
one-sample t-test, n =5 cells from 3 mice) and hyperpolarization in LH-projecting 
neurons (t5 = 2.789, P< 0.05, one-sample t-test, n = 6 cells 
from 3 mice). d, Confocal image of retrogradely CTB-labelled VTA (red) 
and LH (green) outputs in a CRF-L10a reporter (blue). e, f, Pie charts 
depicting the percentage of LH-projecting only, VTA-projecting only, 
collateralizing, and CTB-negative (unlabelled) CRF neurons in the 
dorsal and ventral aspects of the BNST (n =6 hemispheres from 3 mice). 
g, Experimental schematic depicting viral infusions into the BNST and 
retrograde tracer bead infusions into the LH of CrfCre::ChR2BNST mice. 
h, Recording configuration in CrfCre::ChR2BNST mice with LH tracer beads. 
i, Representative trace of light evoked IPSCs in LH-projecting neurons 
(n=7 cells from 4 mice) and blockade of this light evoked response by 
GABAzine (n=2 cells from 2 mice). j, Recording configuration in VTA-projecting 
neurons in the BNST of C57BL/6 mice. k, l, 5-HT has no effect 
on miniature IPSC frequency or amplitude in BNST→VTA projecting 
neurons (n=7 from 4 mice). m, n, 5-HT has no effect on sIPSC frequency 
or amplitude in the presence of the 5-HT2CR antagonist RS-102221 (n=5 
cells from 4 mice). o, Recording configuration in LH projecting neurons in 
the BNST of C57BL/6 mice. p, Representative traces showing an increase 
in sIPSC frequency in the presence of 5-HT for 6 cells from 3 mice. 
q, r, 5-HT increases sIPSC frequency but not amplitude in BNST→LH 
projecting neurons (F11,55= 11.65, P< 0.01, repeated measures one-way 
ANOVA, n= 6 cells from 3 mice). s, t, 5-HT has no effect on miniature 
IPSC frequency or amplitude (n =5 cells from 3 mice). u, v, 5-HT has 
no effect on sIPSC frequency or amplitude in the presence of RS-102221 
(n=6 cells from 4 mice). Data are mean ± s.e.m. *P < 0.05. 
Extended Data Figure 6 | 5-HT does not alter GABAergic transmission 
in CRF neurons nor does it directly excite non-CRF VTA-projecting 
neurons in the BNST. a, Recording configuration in CRFBNST neurons 
in a CRF reporter. b, c, 5-HT has no effect on sIPSC frequency or 
amplitude in the total population of CRF neurons (n =5 cells from 
3 mice). d, Recording configuration in non-CRF, VTA-projecting neurons 
in the BNST and average trace of 5-HT effect on membrane potential 
in non-CRF, VTA-projecting neurons in the presence of tetrodotoxin. 
e, Histogram summarizing 5-HT effects on membrane potential in local 
and VTA-projecting CRF neurons and local CRF neurons in the presence 
of the 5-HT2C receptor antagonist RS-102221 (same data shown in Fig. 2b) 
juxtaposed with the lack of effect of 5-HT on membrane potential in non-CRF, 
VTA-projecting neurons (t4= 0.9381, ns, one-sample t-test, n=5 
cells from 3 mice). Data are mean ± s.e.m. 
Extended Data Figure 7 | The 5-HT2 agonist mCPP increases 
GABAergic but not glutamatergic transmission in the BNST. 
a, b, mCPP increases sIPSC frequency (F5,30 = 1.863, P< 0.001, repeated 
measures one-way ANOVA, n= 3 cells from 3 mice) but not amplitude 
in the BNST of C57BL/6 mice. c, d, mCPP has no effect on spontaneous 
excitatory postsynaptic current (sEPSC) frequency or amplitude in the 
BNST of C57BL/6 mice (n=5 cells from 3 mice). Data are mean ± s.e.m. 
*P< 0.05. 
Extended Data Figure 8 | Optogenetic and intrsectional 
characterization of 5-HT-CRF circuits in the BNST and outputs to 
the midbrain. a, Experimental design and recording configuration from 
SertCre::ChR2DRN→BNST mouse with retrograde tracer beads in the VTA. 
b, Representative traces for 5 cells from 3 mice depicting the increase in 
sIPSCs in VTA-projecting neurons in the BNST following light-evoked 
5-HT release. c, Histogram summarizing the effect of light evoked 5-HT 
release on sIPSC frequency in VTA-projecting neurons (t4 = 4.890, 
P<0.01, one-sample t-test, n =5 cells from 3 mice). d, Experimental 
configuration in CrfCre::Intrsect-ChR2BNST mice. e, Representative images 
from 4 CrfCre::HSV-LSL1-mCherry-flpoLH/VTA mice and 4 CrfCre::HSV-LSL1-mCherryLH/VTA 
mice injected with Intrsect-ChR2-eYFP in the 
BNST. f, Cell counts of eYFP+ neurons from HSV-LSL1-flpo and HSV-LSL1-mCherry 
injected CrfCre::Intrsect-ChR2BNST mice indicating the 
number of non-projecting CRF neurons compared to the total CRF 
population in the dorsal (top panel; t14= 1.959, ns, Student’s unpaired 
two-tailed t-test, n =4 mice, 8 hemispheres per group) and ventral aspects 
of the BNST (bottom panel; t7 = 2.431, P< 0.05, Student’s unpaired 
Welch's corrected two-tailed t-test, n = 4 mice, 8 hemispheres per group). 
g, Recording configuration and light-evoked IPSC showing local GABA 
release from non-projecting CRF neurons in the BNST. h, Stereotaxic 
injection of ChR2 in CrfCre mouse. i, j, Light evoked IPSCs in the VTA and 
LH indicating that CRF projections to these regions are GABAergic. Data 
are mean ± s.e.m. *P< 0.05; **P< 0.01. 


© 2016 Macmillan Publishers Limited, part of Springer Nature. All rights reserved 


LETTER 


[Figure: Extended Data Figure 9, panels a-e. Recoverable labels: Vehicle, Vehicle; Fluoxetine, Vehicle; Vehicle, CP 154,526; Fluoxetine, CP 154,526; Cued Fear Recall; Antalarmin; Retrobeads; 5-HT.] 

Extended Data Figure 9 | Pharmacological blockade of CRF1 receptors 
[...] of GABAergic transmission in the BNST. a, Experimental schedule 
of injections and behaviour. b, CRF1R antagonist does not modify fear 
acquisition but reduces fluoxetine enhancement of cued fear recall 
(F1,20= 13.70, P< 0.01, two-way ANOVA, n=6 per group). c, Recording 
configuration in BNST neurons that project to the LH in C57BL/6 mice. 
d, Bath application of a CRF1R antagonist blocks the 5-HT induced 
[...] (F10,30 = 0.2213, ns, repeated measures one-way ANOVA, n= 4 cells 
from 2 mice). e, There was a reduction in sIPSC amplitude during 
5-HT bath application and CRF1R blockade (F10,30 = 2.941, P< 0.05, 
repeated measures one-way ANOVA, n= 4 cells from 2 mice). Data are 
mean ± s.e.m. **P<0.01. 






[Figure: Extended Data Figure 10, schematic. Recoverable labels: 5-HT; GABA; BNST; LH/VTA; Anxiogenic, Increased fear learning; Anxiolytic, Stress-buffering.] 

Extended Data Figure 10 | Model of a serotonin-sensitive inhibitory 
microcircuit in the BNST that modulates anxiety and aversive learning. 
Serotonin inputs to the BNST activate 5-HT2CRs expressed in non-projecting 
‘local’ CRF neurons. These local CRF neurons promote anxiety 
and fear by inhibiting anxiolytic outputs to the VTA and LH that are 
putatively GABAergic. Another discrete subset of CRF neurons, which are 
inhibited by 5-HT, send direct, inhibitory projections to the VTA and LH. 
These CRFBNST output neurons are GABAergic and putatively anxiolytic 
and stress buffering. Blue dashed lines indicate hypothesized additional 
synapses between CRFBNST neurons. Dashed red line indicates a putatively 
GABAergic synapse. 
HER2 expression identifies dynamic functional 
states within circulating breast cancer cells 


Nicole Vincent Jordan, Aditya Bardia, Ben S. Wittner, Cyril Benes, Matteo Ligorio, Yu Zheng, Min Yu, 
Tilak K. Sundaresan, Joseph A. Licausi, Rushil Desai, Ryan M. O’Keefe, Richard Y. Ebright, Myriam Boukhali, 
Srinjoy Sil, Maristela L. Onozato, Anthony J. Iafrate, Ravi Kapur, Dennis Sgroi, David T. Ting, Mehmet Toner, 
Sridhar Ramaswamy, Wilhelm Haas, Shyamala Maheswaran & Daniel A. Haber 


Circulating tumour cells in women with advanced oestrogen- 
receptor (ER)-positive/human epidermal growth factor receptor 
2 (HER2)-negative breast cancer acquire a HER2-positive 
subpopulation after multiple courses of therapy’. In contrast to 
HER2-amplified primary breast cancer, which is highly sensitive 
to HER2-targeted therapy, the clinical significance of acquired 
HER2 heterogeneity during the evolution of metastatic breast 
cancer is unknown. Here we analyse circulating tumour cells 
from 19 women with ER+/HER2− primary tumours, 84% of 
whom had acquired circulating tumour cells expressing HER2. 
Cultured circulating tumour cells maintain discrete HER2+ and 
HER2− subpopulations: HER2+ circulating tumour cells are more 
proliferative but not addicted to HER2, consistent with activation 
of multiple signalling pathways; HER2− circulating tumour cells 
show activation of Notch and DNA damage pathways, exhibiting 
resistance to cytotoxic chemotherapy, but sensitivity to Notch 
inhibition. HER2+ and HER2− circulating tumour cells interconvert 
spontaneously, with cells of one phenotype producing daughters 
of the opposite within four cell doublings. Although HER2+ 
and HER2− circulating tumour cells have comparable tumour 
initiating potential, differential proliferation favours the HER2+ 
state, while oxidative stress or cytotoxic chemotherapy enhances 
transition to the HER2− phenotype. Simultaneous treatment with 
paclitaxel and Notch inhibitors achieves sustained suppression of 
tumorigenesis in orthotopic circulating tumour cell-derived tumour 
models. Together, these results point to distinct yet interconverting 
phenotypes within patient-derived circulating tumour cells, 
contributing to progression of breast cancer and acquisition of drug 
resistance. 

We documented the emergence of HER2+ circulating tumour 
cells (CTCs) in patients initially diagnosed with ER-positive/ 
HER2-negative (ER+/HER2−) breast cancer, after multiple courses 
of therapy for recurrent metastatic breast cancer. Using microfluidic 
CTC-iChip purification followed by imaging flow cytometry’, 16 out 
of 19 (84%) patients had HER2+ CTCs (Fig. 1a, Extended Data Fig. 1a 
and Supplementary Table 1). Twenty-two individual CTCs from two 
representative patients (Brx-42, Brx-82) were isolated and subjected to 
single-cell RNA sequencing (scRNA-seq). HER2 expression was 
bimodal in distribution (<1 read per million (RPM) versus median 133, 
range 32-217 RPM; P=7.5 x 10~°) (Fig. 1b), indicating the existence of 
discrete HER2+ and HER2− subpopulations. In these patients, the fraction 
of HER2+ CTCs increased with disease progression (Extended Data 
Fig. 1b). HER2+ CTCs were not restricted to ER+/HER2− breast cancer: 2 
out of 13 patients with ER−/PR−/HER2− (triple negative) breast cancer also 
had HER2+ and HER2− CTC subpopulations (Extended Data Fig. 1c). 
In ER+/HER2− breast cancers, immunohistochemical (IHC) staining 
of patient-matched metastatic tumour biopsies showed increased 
HER2+ staining, compared with primary tumours (Fig. 1c). Unlike 
HER2-amplified breast cancer, HER2+ tumour cells within metastatic 
lesions did not have evidence of gene amplification (Extended Data 
Fig. 1d). 

The CTC-iChip efficiently captures viable CTCs, enabling deri- 
vation of CTC cultures*. We established CTC lines (Brx-42, Brx-82, 
Brx-142) with discrete HER2+/HER2− subpopulations comparable 
to patient-matched primary CTCs (Fig. 1a, d and Extended Data 
Fig. 1e, f). Acquired HER2 expression was not due to gene amplification, 
and no distinguishing mutations were identified between HER2+ 
and HER2− subpopulations (Extended Data Fig. 1g and Supplementary 
Table 2). Fluorescence-activated cell sorting (FACS) of HER2+ versus 
HER2− subpopulations showed distinct functional properties: HER2+ 
CTCs had a higher proliferation rate (Fig. 1e), with increased staining 
for the proliferation marker Ki67, but no change in apoptotic markers 
cleaved-caspase 3 or annexin 5 (Extended Data Fig. 2a, b). 

We tested the relative tumorigenicity of HER2+ versus HER2− CTCs 
following injection into the mouse mammary fat pad. Both FACS-purified 
HER2+ and HER2− CTCs generated tumours, with HER2+ 
tumours being larger and having a higher frequency of lung metastases 
(Fig. 1f and Extended Data Fig. 2c, d). Despite differences in proliferation, 
limiting dilution studies showed that HER2+ and HER2− CTCs 
initiate tumours from as few as 200 cells, pointing to comparable 
progenitor potential (Extended Data Fig. 2e). 

The coexistence of HER2+ and HER2− CTCs, despite differing proliferation 
rates, led us to test whether these subpopulations are capable 
of interconversion. After 4 weeks in culture, FACS-purified green 
fluorescent protein (GFP)-tagged HER2− CTCs acquired HER2+ cells 
(Brx-82: 42%; Brx-142: 46%), while HER2+ CTCs generated HER2− 
cells at lower efficiency (Brx-82: 5%; Brx-142: 11%) (Fig. 2a, b and 
Extended Data Fig. 3a). By 8 weeks, the parental HER2+/HER2− composition 
was nearly re-established (Fig. 2b). This interconversion was 
also evident by mixing equal proportions of GFP+/HER2+ and GFP−/ 
HER2− CTCs, with the emergence of GFP+/HER2− and GFP−/HER2+ 
cells, respectively (Extended Data Fig. 3a). 

To better define the timing of HER2+/HER2− interconversion, we 
established single-cell-derived CTC colonies using HER2-based FACS, 
followed by sequential confocal microscopy. Colonies were scored for 
HER2 and EpCAM expression at 1-, 3-, 5- to 9-, 10- to 19- and >20-cell 
stages. Single HER2− CTCs initially proliferated slowly (Extended 
Data Fig. 3b), and first acquired HER2+ daughter cells at the 5- to 
9-cell stage (6.5%), with rapid interconversion thereafter (10-19 cells: 
47%, >20 cells: 59%; Fig. 2c, d). The more rapidly proliferating 
single HER2+ CTCs also generated HER2− progeny at the 5- to 9-cell 
stage (5%), but the proportion of HER2− CTCs rose more slowly 
Figure 1 | Distinct properties of HER2+ and HER2− CTC 
subpopulations from patients with advanced ER+/HER2− breast 
cancer. a, Quantitation by imaging flow cytometry of HER2+ and HER2− 
CTCs isolated from patients Brx-42, Brx-82. EpCAM (yellow) and HER2 
(green). Scale bar, 10 μm. b, Bimodal distribution of ERBB2 RNA-seq reads 
from single CTCs, Hartigans’ dip test, P=7.5 x 10°, n=22 (HER2− <1 
RPM; HER2+ > 133, range 32-217). c, IHC for HER2 (brown) in matched 
metastatic versus primary tumours (Brx-42, Brx-82, Brx-142) compared 
with HER2-amplified tumour (control). Scale bar, 100 μm; tumour data 
(Supplementary Table 1). d, FACS of cultured CTCs, showing discrete 
HER2+ and HER2− subpopulations. MDA-231 (triple-negative breast 
cancer (TNBC)) and SKBR3 (HER2-amplified) cells are shown as control. 
e, Differential proliferation of FACS-purified HER2+ (red) and HER2− 
(blue) subpopulations from cultured CTCs; two-way analysis of variance 
(ANOVA) P< 0.01 (Brx-82), P< 0.0001 (Brx-142); n =6; s.d. (error bar). 
f, Increased in vivo growth of orthotopic mammary tumours derived from 
FACS-purified, HER2+ CTCs compared with HER2− cells; n= 8; two-way 
ANOVA, P < 0.0001; s.d. (error bar). 


(10-19 cells: 17%, >20 cells: 22%; Fig. 2c, d). Thus, interconversion 
between HER2+ and HER2− phenotypes occurs spontaneously as early as 
four cell doublings. 

Interconversion between HER2+ and HER2− phenotypes was also 
tested in vivo by orthotopic inoculation of FACS-purified cultured 
CTCs. Tumours established from HER2− CTCs displayed HER2+ 
subpopulations, and vice versa (Fig. 2e and Extended Data Fig. 3c). 
In vivo interconversion was confirmed by injecting a 1:1 mixture of 
GFP+/HER2+ and GFP−/HER2− CTCs (or the converse), followed by 
dual GFP and HER2 IHC. Within mixed tumours, GFP-tagged HER2− 
CTCs produced GFP+/HER2+ cells (44%), and in separate tumours, 
GFP-tagged HER2+ CTCs generated GFP+/HER2− cells (21%) (Fig. 2f 
and Extended Data Fig. 3d). 
Figure 2 | Interconversion of HER2+ and HER2− phenotypes. a, FACS-purified 
GFP-tagged HER2+ and HER2− CTCs generate HER2− (top) 
and HER2+ cells (bottom), respectively. b, Time course of HER2+/HER2− 
interconversion following FACS-isolation of HER2+ (red) and HER2− 
(blue) cells; n = 3; s.d. (error bar). Parental cultured CTCs (black dotted) 
are shown as control. c, Representative confocal microscopic images 
depicting HER2+/HER2− interconversion within single-cell-derived 
clones at indicated time points (D, days) and colony sizes. EpCAM (green), 
HER2 (red) and MERGED (gold). Scale bar, 20 µm; n = 20. Arrows and 
dashed boxes indicate interconverting cells, with loss/gain of HER2. 
d, Quantitation of HER2+/HER2− interconversion from single-cell-derived 
colonies at each colony size; t-test, P < 0.0001; n = 20; s.d. (error 
bar). e, Gain/loss of HER2+ cells (brown, arrow) in tumour xenografts 
derived from purified HER2− (left)/HER2+ (right) CTCs. Scale bar, 
100 µm (top); 50 µm (bottom); n = 8. f, IHC imaging and quantitation of 
GFP+/HER2+ cells within tumours generated from GFP-tagged/HER2− 
and untagged HER2+ CTCs (top), and the converse (bottom). GFP: 
cytoplasmic red; HER2: membrane brown. Scale bar, 20 µm; t-test, 
*P < 0.05, ****P < 0.0001; n = 6; s.d. (error bar). 


To define the molecular characteristics of HER2+ versus HER2− 
CTCs, we quantitatively mapped the global proteomes (>6,300 proteins) 
of FACS-purified subpopulations (Brx-42, Brx-82, Brx-142) 
using multiplexed mass spectrometry (MS) with isobaric tandem mass 
tags (TMT)° (Supplementary Table 3). While proteome profiles of individual 
cell lines were distinct, they shared differences between HER2+ 
and HER2− subpopulations (Pearson correlation coefficients: Brx-82 
versus Brx-142 = 0.81; Brx-82 versus Brx-42 = 0.71; Brx-42 versus 
Brx-142 = 0.64) (Fig. 3a and Extended Data Fig. 4a, b). HER2+ CTCs 
showed enrichment (Pathway Interaction Database (PID)) of receptor 
tyrosine kinase (RTK) and pro-growth signalling (GSEA, false discovery 
rate (FDR) < 0.25) (Fig. 3b and Supplementary Tables 3 and 4). 
Phosphotyrosine blots from the HER2+ subpopulations confirmed 
RTK phosphorylation (HER2, HER3, HER4, insulin receptor (INSR), 
EPHA1, EPHA2 and EPHA10), which was absent from matched 
Figure 3 | Molecular pathways differentially activated in HER2− versus 
HER2+ cultured CTCs. a, Comparison of quantitative MS proteomes 
(6,349 proteins) showing distinct profiles for individual cultured CTCs 
(Brx-82, Brx-142), but linear correlation between proteins differentially 
expressed in HER2+ and HER2− subpopulations; NI, normalized 
intensity; n = 2 biological replicates per CTC line Brx-42, Brx-82, Brx-142 
(Supplementary Table 3). b, c, Cytoscape network maps (top) and GSEA 
pathway analysis (bottom) depicting proteins enriched by greater than 
log2(0.5) by quantitative MS in (b) HER2+ and (c) HER2− CTCs (GSEA 
FDR < 0.25; nominal P cut-off < 0.05; Supplementary Table 4). Coloured 
shapes represent proteins within denoted pathways. Red asterisks highlight 
RTK pathways in b, and Notch pathways in c. 


HER2− CTCs (Extended Data Fig. 4c). scRNA-seq analysis of 15 primary 
HER2+ CTCs compared with 7 HER2− CTCs from matched 
patient blood samples showed enrichment for 15 of 32 shared pathways 
(ERBB1, ERBB2/ERBB3, IGF1, EPHA2, MET) identified by MS 
analysis of cultured CTC lines (Fig. 3b, Extended Data Fig. 4d, e and 
Supplementary Tables 4 and 5). In contrast to HER2+ CTCs, MS analysis 
of cultured HER2− CTCs showed increased expression of proteins 
enriched in Notch (HES/HEY, Presenilin 1 (PS1)) and DNA damage 
pathways (AuroraB, ATM, ATR, Fanconi) (GSEA, FDR < 0.25) (Fig. 3c 
and Supplementary Tables 3 and 4). 

To explore the potential therapeutic significance of pathways differentially 
activated in HER2+ versus HER2− CTC subpopulations, 
we screened a panel of 55 drugs selected both for clinical relevance 
and for the ability to target MS-identified pathways (Supplementary 
Table 6). HER2+ CTCs were no more sensitive to the HER2 inhibitor 
lapatinib than HER2− CTCs (half-maximum inhibitory concentration 
(IC50) = 1 µM), indicating they were not ‘oncogene addicted’ to 
HER2, unlike the HER2-amplified SKBR3 cells (IC50 = 5 nM) (Fig. 4a, 
Extended Data Fig. 5a, b). However, dual inhibition of HER2 and 
IGF1R, another RTK activated in HER2+ CTCs, was cytotoxic to 
HER2+ but not HER2− CTCs (Fig. 4a), suggesting inhibition of multiple 
receptor tyrosine kinases may be effective in treating HER2+ 
CTCs. Compared with HER2+ CTCs, HER2− CTCs showed reduced 
sensitivity to the chemotherapeutic agents docetaxel, doxorubicin 
and 5-fluorouracil (5-FU) (Fig. 4b and Extended Data Fig. 5a, c), but 
increased sensitivity to γ-secretase inhibitors, which suppress Notch 
activity (Fig. 4b and Extended Data Fig. 5a, d). Despite proteomic 
enrichment for Aurora B signalling, HER2− CTCs were not differentially 
sensitive to Aurora family inhibitors (Fig. 3c, Extended Data 
Fig. 5a and Supplementary Tables 3 and 4). 

The increased NOTCH1 in HER2− CTCs observed by quantitative 
MS and confirmed by western blot (Extended Data Fig. 6a and 
Supplementary Tables 3 and 4) was inversely correlated with HER2 
expression within primary CTCs and CTC lines, shown by scRNA-seq 
and immunostaining (Extended Data Fig. 6a). We therefore tested the 
consequences of suppressing HER2 or activating Notch signalling in 
HER2+ CTCs. 

Manipulation of NOTCH1 or its downstream effector NFE2L2/NRF2 
in cultured HER2+ CTCs did not reduce HER2 expression (Extended 
Data Fig. 6b). However, inhibition of HER2 using lapatinib or short 
interfering RNA (siRNA) led to increased expression of NOTCH1, its 
ligands JAG1 and DLL1, and Notch-regulated genes HES1, HEY1 and 
HEY2 (Fig. 4c and Extended Data Fig. 6c), confirming previous reports 
from HER2-amplified breast cancer cells®’ (Extended Data Fig. 6c). 
Suppression of HER2 also resulted in increased expression of genes 
(GCLC, GGT1, GPX1, GPX4, HMOX1) downstream of Notch-regulated 
NRF2, a transcriptional regulator of anti-oxidant/glutathione metabolism 
pathways*? (Extended Data Fig. 6d). Thus, expression of HER2 
in CTCs appears to mediate downregulation of the NOTCH1/NRF2 
axis, potentially switching between proliferative and survival-prone 
phenotypes. 

In addition to suppressing HER2 directly, we tested additional stimuli 
capable of modulating the HER2+/HER2− interconversion. Treatment 
of HER2+ CTCs with low doses of docetaxel (1 nM) or induction of 
oxidative stress with hydrogen peroxide (H2O2; 10 mM) induced rapid 
shifts from HER2+ to HER2− (30% conversion, >70% survival) (Fig. 4d). 
To exclude differential cell death, we demonstrated acceleration in the 
appearance of HER2− progeny from FACS-purified single HER2+ CTCs 
(5- to 9-cell stage: 45%; >10-cell stage: 62%) (Fig. 4e and Extended Data 
Fig. 6e). Thus, exposure to cytotoxic/oxidative stress mediates a switch 
to a less proliferative but more drug-resistant phenotype. 

To model the potential significance of HER2+/HER2− interconversion 
in vivo, we generated orthotopic mammary xenografts from 
FACS-purified subpopulations and analysed tumours before and after 
treatment with paclitaxel. Purified HER2+ CTCs generated mixed 
tumours (88% HER2+, 12% HER2−) and showed dramatic tumour 
shrinkage following paclitaxel treatment. The recurrent tumour showed 
a transient reduction in HER2+ with a corresponding increase in 
HER2− composition following chemotherapy (2 weeks: 39% HER2+; 
7 weeks: 74% HER2+; Fig. 4f). Purified HER2− CTCs also gave rise to 
a mixed tumour (35% HER2+, 65% HER2−), but paclitaxel induced 
only a limited delay in tumour growth with a minimal effect on HER2 
content. Shedding of CTCs was also suppressed by paclitaxel in HER2+ 
but not HER2− tumours (Extended Data Fig. 6f). The chemotherapy-induced 
shift in HER2 composition was also evident following inoculation 
of parental CTC cultures (untreated 65% HER2+; post-therapy 
30% HER2+; Extended Data Fig. 6g). Finally, we generated tumours 
from a 1:1 mixture of GFP-tagged HER2+ and untagged HER2− cells, 
demonstrating a shift from GFP+/HER2+ to GFP+/HER2− cells following 
paclitaxel treatment (untreated: 70% GFP+/HER2+, post-therapy: 
42% GFP+/HER2+; Extended Data Fig. 6h). The potent effect of 
chemotherapy on HER2+/HER2− phenotypes in vivo may reflect both 
reduced drug-sensitivity of HER2− cells, as well as stress-induced 
HER2+ to HER2− switching. 

Given the demonstrated susceptibility of HER2− CTCs to Notch 
inhibitors, we combined paclitaxel with either of two γ-secretase 
Figure 4 | Cooperative targeting of HER2+ and HER2− CTC 
subpopulations suppresses tumour growth. a, HER2+ CTCs show no 
change in sensitivity to lapatinib alone, compared with matched HER2− 
CTCs (Brx-142), but have increased sensitivity to combined HER2 and 
IGF1R (BMS-754807) inhibitors; n = 6; s.d. (error bar). b, HER2− CTCs 
demonstrate reduced chemosensitivity (docetaxel) but have enhanced 
sensitivity to Notch inhibition (BMS-708163, Notchi!), compared with 
HER2+ CTCs; n = 6; s.d. (error bar). c, Inhibition of HER2 with lapatinib 
or siRNA-mediated knockdown in HER2+ CTCs (Brx-82) results in 
dose-dependent increase of Notch-related genes: NOTCH1, JAG1, DLL1, 
HES1, HEY1, HEY2; P (t-test) < 0.05; n = 6; s.e.m. (error bar). d, Rapid 
emergence (96 h) of HER2− CTCs following treatment of HER2+ CTCs 
with H2O2 (10 mM) or docetaxel (1 nM). e, Confocal microscopy showing 
rapid appearance of HER2− progeny from single-CTC derived HER2+ 


inhibitors (LY-411575; RO4929097) in treating mice with tumours 
initiated from parental CTC lines. Compared with paclitaxel alone, the 
combination therapy significantly delayed onset of tumour recurrence, 
while Notch inhibition alone had no effect on tumour growth (Fig. 4g 
and Extended Data Fig. 6i). 

Taken together, we have used primary and cultured CTCs from 
patients with ER+/HER2− breast cancer who developed metastatic 
multidrug-resistant disease to show that coexisting distinct HER2+ 
and HER2− tumour cell subpopulations may interconvert, with striking 


colonies treated with H2O2. EpCAM (green), HER2 (red) and MERGED 
(gold). Scale bar, 20 µm; n = 10. Arrows and dashed boxes indicate 
cells with loss of HER2 at indicated time points. f, Paclitaxel treatment 
(4 weeks) of mice with CTC-derived (Brx-142) orthotopic mammary 
tumours. Top: HER2+/HER2− tumour growth curves with paclitaxel 
treatment; bottom: representative IHC for HER2 (brown) in HER2+- and 
HER2−-derived tumours at the U (untreated), T (2 weeks post-treatment) 
and R (7 weeks post-treatment) time points. Scale bar, 100 µm; t-test, 
*P < 0.05, ****P < 0.0001. g, Simultaneous treatment (4 weeks) of 
mammary xenografts (Brx-82) with paclitaxel and either Notch inhibitor 
RO4929097 (Notchi’) or LY-411575 (Notchi*), showing sustained responses 
for the combination, compared with paclitaxel alone. Rx denotes treatment 
duration; two-way ANOVA, P < 0.0001, n = 6. 


consequences for disease progression and drug response. The comparable 
tumour initiating potential and similar expression of stem 
cell marker ALDH1 in HER2+ and HER2− CTCs suggest underlying 
tumour cell plasticity in these advanced patient-derived breast CTC 
lines, rather than a hierarchical cancer stem-cell model as described 
in drug-resistant subpopulations within established breast cancer cell 
lines”!°-!, While expression of NOTCH1 and other embryonic markers 
has been reported in rare, quiescent cells within primary breast 
tumours”!*!8 the NOTCH1+ CTCs reported here constitute a major 
cell population, exhibiting both persistent cell proliferation in vitro and 
tumorigenesis in vivo. Thus, we propose a dynamic model, in which 
the equilibrium between HER2+ and HER2− cells within a heterogeneous 
tumour population is driven by spontaneous interconversion 
between these phenotypes, with the more rapidly proliferating HER2+ 
cells prevalent under baseline conditions, and environmental or therapy-induced 
stress enhancing conversion to the more resistant HER2− 
phenotype. Neither molecular profiling nor functional studies have 
revealed secreted factors that affect the mutual survival of HER2+ and 
HER2− CTCs, but we cannot exclude such additional factors. 

Finally, the properties of patient-derived CTC lines established after 
multiple courses of therapy provide relevant insight to the treatment 
of drug-refractory, advanced breast cancer. While clinical trials are 
evaluating the efficacy of HER2-targeted therapy in HER2− breast cancer 
with acquired HER2+ CTCs"!°~1, our observations indicate that acquisition 
of HER2 does not indicate HER2 oncogene dependence and 
drug susceptibility; instead it constitutes a marker of a proliferative, 
multi-RTK state. Furthermore, the interconversion of chemotherapy-sensitive 
HER2+/NOTCH1− and NOTCH inhibitor-sensitive HER2−/NOTCH1+ 
CTCs suggests that dual treatment, as modelled here, may 
be required for effective treatment. Clinical trials so far have had limited 
success sequentially administering embryonic pathway inhibitors 
targeting Hedgehog, Wnt or Notch to inhibit cancer stem cells following 
initial chemotherapy!**?~>. The rapid interconversion between proliferative 
and drug-resistant CTC subpopulations raises the possibility 
that simultaneous combination therapy may provide a novel strategy 
for clinical validation. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS 


Patient selection and CTC isolation. Patients with a diagnosis of metastatic breast 
cancer provided informed consent for de-identified blood collection, as per insti- 
tutional review board approved protocol (DF/HCC 05-300). Enrolled patients 
had received multiple courses of therapy, which is typical in advanced ER+ breast 
cancer, and we did not have sufficient power in this pilot study to enable a statis- 
tically significant correlation between the number of therapeutic interventions 
and the frequency of HER2* CTCs. Patient-matched primary and metastatic 
tumour specimens were collected according to institutional review board approved 
protocol (2002-P-002059), and relevant tumour source data are provided in 
Supplementary Table 1. 

Single CTCs were isolated from fresh whole blood by depleting leukocytes using 
the microfluidic CTC-iChip as previously described’. Briefly, whole blood samples 
were incubated with biotinylated antibodies against CD45 (R&D Systems, clone 
2D1), CD66b (AbD Serotec, clone 80H3) and CD16 (BD, clone 3G8) followed 
by incubation with Dynabeads MyOne Streptavidin T1 (Invitrogen) to achieve 
magnetic labelling of white blood cells. This mixture was processed through the 
CTC-iChip, and the CTCs were stained in solution with Alexa 488-conjugated 
antibodies against EpCAM (Cell Signaling Technology, clone VU1D9) and HER2 
(Cell Signaling Technology, clone 29D8 or Janssen R&D) and identified by imaging 
flow cytometry (Amnis). Individual CTCs were picked after staining as described 
above, and PE-CF594-conjugated antibody against CD45 (BD Biosciences, clone 
HI30) was included to stain contaminating leukocytes. CTCs were individually 
micromanipulated using a 10 µm transfer tip on an Eppendorf TransferMan NK 
2 micromanipulator, transferred into PCR tubes containing RNA protective lysis 
buffer, and flash frozen in liquid nitrogen as previously described”°. Standard CTC 
enumeration of fixed samples was performed on the BioView high content imaging 
system following Megafunnel fixation and staining with the combination of 
wide spectrum cytokeratin (Abcam, ab9377), EpCAM (Cell Signaling Technology, 
clone VU1D9), EGFR (Cell Signaling Technology, clone D38B1) and HER2 (Cell 
Signaling Technology, clone 29D8) antibodies. 

For mouse xenograft studies, blood was collected via cardiac puncture and 
~1 ml of blood was processed through the microfluidic CTC-iChip. CTCs were enumerated 
on the BioView imaging system after staining with Alexa 488-conjugated 
antibodies against EpCAM (Cell Signaling Technology, clone VU1D9), HER2 
(Janssen R&D or Cell Signaling Technology, clone 29D8) and GFP (ab13970) 
followed by secondary antibodies conjugated with Alexa-488 (Invitrogen). 
Immunohistochemistry. Tissues were sectioned, and slides were incubated in 
0.3% hydrogen peroxide in methanol for 20 min to block endogenous peroxi- 
dase activity. Tissues were permeabilized, and antigen retrieval was performed in 
1x citrate buffer (pH 6) for 15 min. Slides were washed and blocked for 30 min 
with 5% goat serum. Primary HER2 (Cell Signaling, 29D8) or GFP (Living Colours 
AV 632381) antibodies were diluted 1:75 or 1:250 in DAKO antibody diluent and 
samples were incubated for 1h at room temperature. Slides were incubated with 
HRP anti-rabbit antibody (EnVision + DAKO) for 30 min. After washing with 
PBS, the peroxidase reaction was performed with 3,3’-diaminobenzidine (DAB) 
from Vector Laboratories for 10 min. Cells were counterstained with Gill's #2 
haematoxylin for 10-15 s, dehydrated with ethanol and cleared with xylene before 
mounting. Images represent at least five independent fields from six to eight 
xenograft tumours per condition. 
Fluorescence in situ hybridization. Fluorescence in situ hybridization was performed 
as described previously”””*. Briefly, 5-µm sections of formalin-fixed, 
paraffin-embedded tumour samples were de-paraffinized, hydrated and pretreated 
with 0.1% pepsin for 1-2 h. Slides were then washed in 2x saline-sodium 
citrate buffer (SSC), dehydrated, air dried and co-denatured at 80°C for 5min 
with a mixture of CEP17 and HER2 probes and hybridized at 40°C overnight 
using the Hybrite Hybridization System (Abbott). Two-minute post-hybridization 
washes were performed in 2 x SSC/0.3%NP40 at 72°C followed by a 1 min wash in 
2x SSC at room temperature. Slides were mounted with Vectashield containing 
4',6-diamidino-2-phenylindole (Vector, Burlingame, California, USA). Entire 
sections were observed with an Olympus BX61 fluorescent microscope equipped 
with a charge-coupled device camera and analysed with Cytovision software 
(Applied Imaging, Santa Clara, California). 

The HER2 and CEP17 signals were quantified in 50 randomly selected, 
non-overlapping nuclei, and mean numbers of HER2 and CEP17 copies per 
nucleus were calculated. HER2 was considered amplified when the HER2:CEP17 
ratio was >2.0 or the number of HER2 signals per nucleus was >6, following the guidelines of the 
American Society of Clinical Oncology/College of American Pathologists”’. The 
probes used in this study consisted of centromeric CEP: 17p11.1-q11.1, spectrum 
aqua (Abbott Molecular, Des Plaines, Illinois) and locus-specific identifier probes 
derived from bacterial artificial chromosome RP11-94L15 (17q12-17q21.1, 
spectrum orange probe (CHORI, Oakland, California)). 
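
The scoring rule described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function and argument names are hypothetical, and the ratio is assumed here to be the mean HER2 copies per nucleus divided by the mean CEP17 copies per nucleus. Only the two thresholds (ratio >2.0, or more than 6 HER2 signals per nucleus) are taken from the text.

```python
from statistics import mean


def her2_fish_status(her2_counts, cep17_counts):
    """Score HER2 amplification from per-nucleus signal counts.

    her2_counts / cep17_counts: one count per scored nucleus
    (the text scores 50 randomly selected, non-overlapping nuclei).
    """
    if not her2_counts or len(her2_counts) != len(cep17_counts):
        raise ValueError("need one HER2 and one CEP17 count per nucleus")
    mean_her2 = mean(her2_counts)
    mean_cep17 = mean(cep17_counts)
    if mean_cep17 == 0:
        raise ValueError("no CEP17 signal detected; ratio is undefined")
    # Assumption: ratio of the two per-nucleus means.
    ratio = mean_her2 / mean_cep17
    # Thresholds as stated in the text: ratio > 2.0 or > 6 HER2 signals per nucleus.
    amplified = ratio > 2.0 or mean_her2 > 6
    return {"mean_her2": mean_her2, "mean_cep17": mean_cep17,
            "ratio": ratio, "amplified": amplified}
```

For example, 50 nuclei with two HER2 and two CEP17 signals each give a ratio of 1.0 and are scored as not amplified.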
CTC cell culture. CTC cultures were grown in suspension in ultra-low attachment 
plates (Corning) in tumour sphere medium (RPMI-1640, EGF (20 ng/ml), 
bFGF (20 ng/ml), 1X B27, 1X antibiotic/antimycotic (Life Technologies)) under 
hypoxic (4% O2) conditions. The breast CTC lines, Brx-42, Brx-82 and Brx-142, 
were derived from CTCs isolated using the CTC-iChip as previously described’. 
CTC lines were routinely checked for mycoplasma, using a mycoplasma detection 
kit (MycoAlert, Lonza), and were authenticated by RNA-seq, MS and DNA-seq 
(1,000 gene mutation panel). 

Fluorescence-activated cell sorting (FACS). Cells were trypsinized into single-cell 
suspensions, resuspended in Hanks’ balanced salt solution (HBSS), and incubated 
with anti-HER2/NEU APC (BD, clone 42 c-erbB-2), anti-HER2 FITC 
(Janssen R&D) or Annexin V FITC (BD, clone RUO) antibodies for 20 min at 
4°C. Unbound antibodies were washed from cells using HBSS. For analytical flow, 
cells were fixed with 3% paraformaldehyde and analysed using a Laser BD Fortessa 
instrument. For sterile live-cell flow cytometry, cells were sorted using a Laser BD 
FACS Aria Fusion Cell Sorter, BSL2+. FACS plots are representative of at least two 
independent experiments performed within 6 months of culture initiation (Figs 1d 
and 2a and Extended Data Figs 1f and 3a). 

Sequencing analysis of genomic DNA. Genomic DNA extracted from CTC- 
derived cell lines was sequenced using a multiplex polymerase chain reaction 
(PCR) technology called Anchored Multiplex PCR (AMP) for single nucleotide 
variant (SNV) and insertion/deletion (indel) detection using next generation 
sequencing (NGS) as previously described”. Briefly, genomic DNA was isolated 
from cell lines and then sheared with the Covaris M220 instrument, followed by 
end-repair, adenylation and ligation with an adaptor. A sequencing library targeting 
hotspots and exons in 39 commonly mutated, cancer-associated genes was gener- 
ated using two hemi-nested PCR reactions. Illumina MiSeq 2 x 151 base paired- 
end sequencing results were aligned to the hg19 human genome reference using 
BWA-MEM™". MuTect™ and a laboratory-developed insertion/deletion analysis 
algorithm were used for SNV and indel variant detection, respectively. This assay 
has been validated to detect SNV and indel variants at 5% allelic frequency or 
higher in target regions with sufficient read coverage. 
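
The reporting limit stated above (variants at 5% allelic frequency or higher, given sufficient coverage) can be expressed as a simple filter. The sketch below is not the laboratory's pipeline. The function name is hypothetical, and `min_depth` is a placeholder, because the text does not say what read depth counts as sufficient.

```python
def passes_reporting_threshold(ref_reads, alt_reads, min_vaf=0.05, min_depth=100):
    """Return True if a candidate variant meets the allelic-frequency limit.

    min_vaf=0.05 reflects the 5% limit stated in the text.
    min_depth=100 is a placeholder assumption, not a value from the text.
    """
    depth = ref_reads + alt_reads
    if depth < min_depth:
        return False  # coverage too low to make a call
    variant_allele_fraction = alt_reads / depth
    return variant_allele_fraction >= min_vaf
```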

Lentivirus production, infection and siRNA knockdown of CTC cell lines. To 
produce replication-incompetent lentivirus, 293T cells were co-transfected with 
either Lenti-Luc-GFP or Notch intracellular domain-pcw107 (Addgene 64621) 
constructs in combination with REV, VSVG, PDML or pMD2.G and psPAX2 
(Addgene) using Lipofectamine Plus reagent (Invitrogen). Twenty-four hours 
later, growth medium was replenished. Viral supernatants were harvested 48h 
post-transfection, concentrated with Lenti-X Concentrator (Clontech), and viral 
pellets were resuspended in 400 µl base medium. CTC cultures were infected overnight 
with 100 µl lentivirus in 6 µg/ml Polybrene. Puromycin (3 µg/ml) was used 
to select transduced cells over a period of 7 days. For the RNAi knockdown, CTC 
lines Brx-42, Brx-82 and Brx-142 were reverse transfected in ultra-low attachment 
six-well plates (Corning) with 25 nM siRNA smart pools (Dharmacon) containing 
the combination of four different siRNA oligonucleotides for ERBB2/HER2 
(GGACGAAUUCUGCACAAUG; GACGAAUUCUGCACAAUGG; 
CUACAACACAGACACGUUU; AGACGAAGCAUACGUGAUG), NOTCH1 
(GCGACAAGGUGUUGACGUU; GAUGCGAGAUCGACGUCAA; 
GAACGGGGCUAACAAAGAU; GCAAGGACCACUUCAGCGA), NFE2L2 
(GAGAAAGAAUUGCCUGUAA; CCAAAGAGCAGUUCAAUGA; 
UAAAGUGGCUGCUCAGAAU; UGACAGAAGUUGACAAUUA) or the 
negative control gene GAPDH. siRNA pools for target genes were deconvolved 
to demonstrate targeted knockdown efficiency (more than two siRNAs 
per gene). 

Immunofluorescence. CTC lines were spun onto poly-L-lysine-functionalized 
glass slides with Spintrap, fixed with 3% paraformaldehyde, permeabilized with 
0.1% Triton X and stained with nuclear 4′,6-diamidino-2-phenylindole (DAPI) 
stain, HER2 (Cell Signaling Technology, clone 29D8), Ki67 (Zymed), Cleaved 
Caspase-3 (Cell Signaling Technology, clone D3E9) and/or NOTCH1 (Cell 
Signaling Technology, clone D1E11) antibodies. Secondary antibodies were 
conjugated to either Alexa Fluor 488 or Alexa Fluor 594 (Life Technologies), and 
fluorescence was measured using the Nikon 90-I fluorescent microscope. Images 
are representative of at least three independent images per sample. 

Single-cell lineage tracing and confocal microscopy. Single HER2+ or HER2− 
CTCs were flow sorted in 96-well white-walled plates (Corning) using Laser BD 
FACS Aria Fusion Cell Sorter, BSL2+. Single cell, 1-, 3-, 5- to 9-, 10- to 20- and 
>20-cell clones were analysed for heterogeneity in HER2 expression via staining 
with antibodies against EpCAM (FITC labelled; Cell Signaling, clone VU1D9) and 
HER2 (APC labelled, BD, clone 42 c-erbB-2). Imaging and image processing was 
performed sequentially with the confocal microscope (Zeiss 710 Laser Scanning 
Confocal) followed by FIJI (Image J). Images are representative of at least 20 independent 
images per colony size. 
Determination of reads-per-million (RPM). Trimmomatic was used to crop read 
lengths to 50 nucleotides, and to remove the TruSeq3-PE-2 Illumina adapters. The 
paired-end reads were then aligned using tophat2 and bowtie1 with the no-novel-juncs 
argument set with human genome version hg19 and transcriptome defined 
by the hg19 genes.gtf table from http://genome.ucsc.edu. Reads that did not align 
or aligned to multiple locations were discarded. The number of reads aligning to 
each gene was then determined using htseq-count. Samples that had fewer than 
10° reads were discarded. The read count for each gene was divided by the total 
counts assigned to all genes and multiplied by one million to form the reads per 
million (RPM). Samples for which the expression of the white blood cell marker 
PTPRC (CD45) was greater than 10 RPM were discarded. Single-cell RNA-seq data 
have been deposited in the Gene Expression Omnibus under accession number 
GSE75367. 
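
The RPM normalization and the leukocyte-contamination filter described above can be sketched as follows. This is a minimal illustration and not the authors' script: the function names are hypothetical, counts are assumed to be per-gene read counts such as those produced by htseq-count, and the minimum read count is left as a required argument rather than given a default value.

```python
def reads_per_million(gene_counts):
    """Convert per-gene read counts to reads per million (RPM)."""
    total = sum(gene_counts.values())
    if total == 0:
        raise ValueError("sample has no reads assigned to genes")
    return {gene: count * 1_000_000 / total for gene, count in gene_counts.items()}


def keep_sample(gene_counts, min_reads, max_ptprc_rpm=10.0):
    """Apply the two sample-level filters described in the text.

    min_reads: minimum number of assigned reads; supplied by the caller.
    max_ptprc_rpm: samples with PTPRC (CD45) above 10 RPM are discarded.
    """
    total = sum(gene_counts.values())
    if total < min_reads:
        return False
    rpm = reads_per_million(gene_counts)
    return rpm.get("PTPRC", 0.0) <= max_ptprc_rpm
```

Here the read-depth filter is applied to the reads assigned to genes, which is an assumption; the text does not specify whether the threshold refers to raw, aligned or assigned reads.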

Bimodality. To establish that the distribution of HER2 expression in CTCs is 
multi-modal, we applied the Hartigans’ dip test as implemented in the diptest 
R-package to the log10(RPM + 1) values with 10 RPM as the threshold to define 
HER2− versus HER2+ CTCs. To establish that the distribution has two modes 
and not more, we applied the density function of R with default values to the 
log10(RPM + 1) values. 
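
The mode-counting step can be approximated outside R. The sketch below is an assumption-laden stand-in, not the analysis that was run: it evaluates a Gaussian kernel density estimate directly on a grid, using the same rule-of-thumb bandwidth formula as R's bw.nrd0, and counts local maxima. R's density function uses a binned approximation, so results may differ slightly. The dip test itself is not reimplemented here, and all function names are hypothetical.

```python
import math
from statistics import quantiles, stdev


def log_rpm(rpm_values):
    """Transform RPM values to log10(RPM + 1)."""
    return [math.log10(v + 1.0) for v in rpm_values]


def classify_her2(rpm, threshold=10.0):
    """Label a cell by the 10 RPM threshold given in the text."""
    return "HER2+" if rpm > threshold else "HER2-"


def nrd0_bandwidth(x):
    """Rule-of-thumb bandwidth: 0.9 * min(sd, IQR / 1.34) * n^(-1/5)."""
    q1, _, q3 = quantiles(x, n=4, method="inclusive")
    sd = stdev(x)
    spread = min(sd, (q3 - q1) / 1.34)
    if spread == 0:
        spread = sd or abs(x[0]) or 1.0
    return 0.9 * spread * len(x) ** -0.2


def count_density_modes(x, grid_points=512):
    """Count local maxima of a Gaussian kernel density estimate of x."""
    bw = nrd0_bandwidth(x)
    lo, hi = min(x) - 3 * bw, max(x) + 3 * bw
    step = (hi - lo) / (grid_points - 1)
    grid = [lo + i * step for i in range(grid_points)]
    # Unnormalized density is enough for locating maxima.
    dens = [sum(math.exp(-0.5 * ((g - v) / bw) ** 2) for v in x) for g in grid]
    return sum(
        1
        for i in range(1, grid_points - 1)
        if dens[i - 1] < dens[i] >= dens[i + 1]
    )
```

A typical call would be `count_density_modes(log_rpm(erbb2_rpm_values))`, where `erbb2_rpm_values` is a hypothetical list holding one ERBB2 RPM value per cell.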

Gene set enrichment analysis of RNA-seq and quantitative proteomics data. 
On the basis of the analysis of bimodality above, we defined HER2+ samples to be 
those for which the expression of HER2 exceeded 10 RPM and defined the rest to 
be HER2−. For the mass spectrometric data, enrichment of signalling pathways 
was determined by submitting the average log2 fold-change in protein abundance 
between the HER2-high and HER2-low samples to the pre-ranked function of 
the Broad Institute's GSEA software using gene sets in the Pathway Interaction 
Database (PID) and KEGG as curated in version 4 of the Broad Institute’s MSigDB 
(http://www.broadinstitute.org/gsea/msigdb/). Pathway enrichment for the RNA-seq 
of the CTCs was done the same way with the exception that the full RPM 
matrix for the CTCs and the HER2+ versus HER2− distinction was input to the 
GSEA software instead of log2 fold-change. 
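
A ranked list of the kind submitted to a pre-ranked enrichment analysis can be built as sketched below. The pairing of HER2-high and HER2-low intensities (for example, one pair per CTC line) is an assumption, the function names are hypothetical, and the protein names in the usage note are placeholders. The two-column, tab-separated layout follows the usual ranked-list convention and should be checked against the GSEA documentation before use.

```python
import math
from statistics import mean


def build_ranked_list(paired_intensities):
    """Rank proteins by average log2 fold-change (HER2-high over HER2-low).

    paired_intensities: dict mapping protein name to a list of
    (high, low) intensity pairs, e.g. one pair per CTC line (assumed).
    """
    scores = {
        protein: mean(math.log2(high / low) for high, low in pairs)
        for protein, pairs in paired_intensities.items()
    }
    # Highest fold-change first, as expected for a ranked list.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def to_ranked_text(ranked):
    """Format the ranked list as tab-separated 'name<TAB>score' lines."""
    return "\n".join(f"{name}\t{score:.4f}" for name, score in ranked)
```

For instance, `build_ranked_list({"ERBB2": [(8.0, 1.0), (4.0, 1.0)], "NOTCH1": [(1.0, 4.0), (1.0, 2.0)]})` places ERBB2 first and NOTCH1 last.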

Quantitative proteomics. CTC cell pellets were re-suspended in lysis buffer containing 
75 mM NaCl, 50 mM HEPES (pH 8.5), 10 mM sodium pyrophosphate, 
10 mM NaF, 10 mM β-glycerophosphate, 10 mM sodium orthovanadate, 10 mM 
phenylmethanesulfonylfluoride, Roche Complete Protease Inhibitor EDTA-free 
tablets and 3% sodium dodecyl sulfate. Cells were lysed by passing them ten times 
through a 21-gauge needle, and the lysates were prepared for analysis on the mass 
spectrometer essentially as described previously°. Briefly, reduction and thiol 
alkylation were followed by purifying the proteins using MeOH/CHCl3 precipitation. 
Protein digest was performed with Lys-C and trypsin, and peptides were labelled 
with TMT-10plex reagents (Thermo Scientific)** and fractionated by basic pH 
reversed phase chromatography. Multiplexed quantitative proteomics was per- 
formed on an Orbitrap Fusion mass spectrometer (Thermo Scientific) using a 
simultaneous precursor selection (SPS)-based MS3 method™!. MS2 spectra were 
assigned using a SEQUEST-based proteomics analysis platform*>. On the basis of 
the target-decoy database search strategy*® and employing linear discriminant 
analysis and posterior error histogram sorting, peptide and protein assignments 
were filtered to a FDR of < 1% (ref. 35). Peptides with sequences that were con- 
tained in more than one protein sequence from the UniProt database were assigned 
to the protein with most matching peptides**. TMT reporter ion intensities were 
extracted as that of the most intense ion within a 0.03-thomson window around 
the predicted reporter ion intensities in the collected MS3 spectra. Only MS3 with 
an average signal-to-noise value larger than 40 per reporter ion as well as with an 
isolation specificity” larger than 0.75 were considered for quantification. A two-step 
normalization of the protein TMT-intensities was performed by first normalizing 
the protein intensities over all acquired TMT channels for each protein on the 
basis of the median average protein intensity calculated for all proteins. To correct 
for slight mixing errors of the peptide mixture from each sample, a median of the 
normalized intensities was calculated from all protein intensities in each TMT 
channel, and protein intensities were normalized to the median value of these 
median intensities. 
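
One possible reading of the two-step normalization is sketched below. The wording above leaves room for interpretation, so this should be taken as an assumption rather than the procedure actually used. In step 1, each protein is rescaled so that its mean across channels equals the median of all per-protein means. In step 2, each channel is rescaled so that its median equals the median of the channel medians. The function name and data layout are hypothetical.

```python
from statistics import mean, median


def normalize_tmt(intensities):
    """Two-step normalization of TMT reporter intensities.

    intensities: dict mapping protein name to a list of intensities,
    one value per TMT channel (same channel order for every protein).
    """
    # Step 1: bring every protein to a common average intensity.
    protein_means = {p: mean(values) for p, values in intensities.items()}
    target = median(protein_means.values())
    step1 = {
        p: [v * target / protein_means[p] for v in values]
        for p, values in intensities.items()
    }

    # Step 2: correct for unequal sample mixing between channels.
    n_channels = len(next(iter(step1.values())))
    channel_medians = [
        median(values[c] for values in step1.values()) for c in range(n_channels)
    ]
    grand_median = median(channel_medians)
    return {
        p: [v * grand_median / channel_medians[c] for c, v in enumerate(values)]
        for p, values in step1.items()
    }
```

After the second step every channel has the same median intensity, which is the property the mixing correction is meant to enforce.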


Protein interactions were extracted from the String database (high confidence 
score > 0.7)°”. Overlapping proteins were assigned to the pathway with the greatest 
number of proteins, and enriched PID pathways were ranked by log10(P value) to 
the nearest thousandth. Mass spectrometry raw data have been deposited in the 
MassIVE proteomics data repository under the accession number MSV000079419. 
Drug screens. Drugs were obtained from the MGH Center for Molecular 
Therapeutics and are listed in Supplementary Table 6. They were chosen because 
of their common clinical use for treatment of breast cancer or unique targeting of 
epigenetic/stem cell pathways. One thousand cells were seeded in tumour sphere 
media in 384-well ultra-low attachment plates in triplicate wells on duplicate plates 
24h before the addition of drugs. Three independent drug concentrations centred 
on the reported IC50 were used (Supplementary Table 6). Cell viability was assayed 
6 days after drug treatment with CellTiter-Glo (Promega) and was normalized to 
corresponding untreated controls**. 
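
The normalization of viability readings to untreated controls can be sketched as below. The function name and the example luminescence values are hypothetical. Averaging replicate wells before taking the ratio is an assumption, since the text does not state how replicate wells and plates were combined.

```python
from statistics import mean


def relative_viability(treated_wells, untreated_wells):
    """Viability of treated wells as a fraction of the untreated control.

    Both arguments are lists of luminescence readings from replicate wells
    (the text describes triplicate wells on duplicate plates).
    """
    control = mean(untreated_wells)
    if control <= 0:
        raise ValueError("untreated control signal must be positive")
    return mean(treated_wells) / control
```

For example, `relative_viability([500, 520, 480], [1000, 1000, 1000])` returns 0.5, meaning half the signal of the untreated wells.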
Mouse xenograft assays and drug treatment. In compliance with ethical regulations 
and approved by the animal protocol (IACUC 2010N000006), 6-week-old 
female NSG (NOD.Cg-Prkdcscid Il2rgtm1Wjl/SzJ) mice from Jackson 
Laboratories were anaesthetized with isoflurane, and GFP-LUC labelled CTCs 
(200,000, 20,000 and/or limiting dilutions as low as 200 cells) or 50:50 mixed 
CTCs (GFP-LUC+/HER2+: Untagged/HER2−, and the converse) were injected 
into the fourth right mammary fat pad. A 90-day release 0.72 mg oestrogen pellet 
(Innovative Research of America) was implanted subcutaneously behind the neck of 
each mouse. Tumour growth was monitored weekly by in vivo imaging using IVIS 
Lumina II (PerkinElmer) following intraperitoneal injection (150 µl per animal) 
of D-luciferin substrate (Sigma). For in vivo drug sensitivity testing, paclitaxel 
(10 mg/kg) was administered weekly by intravenous injection for 4 consecutive 
weeks. Notch inhibitors (Notchi*) LY-411575 (10 mg/kg) or (Notchi*) RO4929097 
(10 mg/kg) were administered daily (5 days on/2 days off) via oral gavage in 2% 
solvent (2% sodium carboxymethyl cellulose) for 4 consecutive weeks. No animal 
randomization or blinding was used for these mouse studies. All animal studies 
used six to eight mice per condition to ensure sufficient statistical power. 
Extended Data Figure 1 | Patients with advanced ER+/HER2− breast
cancer harbour discrete HER2+ and HER2− subpopulations.
a, CTCs freshly isolated from 19 patients with ER+/HER2− breast cancer
were stained with HER2 (green) and EpCAM (yellow) and imaged
using imaging flow cytometry. Bar graph shows the number of HER2+
(black) and HER2− (white) CTCs (median 22% HER2+ CTCs, range
4–58%). Supplementary Table 1 provides HER2+/HER2− ratios and each
patient’s clinical history. b, scRNA-seq for ERBB2 expression at
multiple time-points showing acquisition of HER2+ CTCs (Brx-82,
Brx-42) over the course of progressive disease. Single asterisk (*) denotes
patient expiration. Rx1, sacituzumab (IMMU-132); Rx2, vinorelbine
+ trastuzumab; Rx3, eribulin. c, Distinct HER2+ and HER2− CTCs from
13 patients with triple-negative breast cancer (TNBC) determined by
scRNA-seq (HER2− < 0 RPM; HER2+ > 153, range 33–463). d, HER2
fluorescence in situ hybridization (FISH) analysis of metastatic tumours
from patients Brx-42, Brx-82 and Brx-142 shows no amplification of
ERBB2 compared with HER2-amplified control (Supplementary Table 1
for tumour source data). HER2 (red); chromosome enumeration probe
17 (CEP17) (cyan); scale bar, 10 µm. Representative images from five
independent fields are shown. e, Bright field and immunofluorescence
(DAPI, blue; HER2, green) images of CTC lines Brx-42, Brx-82 and
Brx-142 demonstrate heterogeneity in HER2 expression. Scale bar, 100 µm
(bright field); 20 µm (immunofluorescence). Representative images from
three independent fields are shown. f, FACS analysis shows two distinct
HER2+ and HER2− subpopulations in the CTC line Brx-42 (at initiation)
compared with HER2− control. Representative data of two independent
experiments are shown. g, HER2 FISH analysis of the HER2+ and HER2−
subpopulations from CTC lines Brx-42, Brx-82 and Brx-142 shows that
ERBB2 is not amplified. HER2-amplified SKBR3 cells shown as control.
HER2 (red); CEP17 (green); scale bar, 10 µm. Representative images from
five independent fields are shown.
Extended Data Figure 2 | HER2+ and HER2− subpopulations exhibit
distinct functional properties. a, Increased expression of the proliferation
marker Ki67 (red) in the HER2+ subpopulation of CTC line Brx-142
(t-test, P < 0.0001), compared with the HER2− subpopulation, with no
change in cleaved-caspase 3 (red). HER2+ cells (green); scale bar, 20 µm.
Representative images from five independent fields are shown. b, FACS
analysis for the apoptotic marker Annexin V-FITC shows no difference
in apoptosis between the HER2+ and HER2− subpopulations of FACS-
purified CTC line Brx-142. Representative data from two independent
experiments are shown. c, Tumours initiated by HER2+ or HER2− CTCs
(Brx-82: 200,000 cells) orthotopically injected into the mammary fat
pad show differential growth rates; n = 8. d, Metastatic frequency of
HER2+ and HER2− cultured CTCs (Brx-82: P = 0.05; Brx-142: P = 0.009)
following orthotopic injection; n = 8. e, Limiting dilution experiments
demonstrate comparable tumour initiating ability from 200 HER2+ and
HER2− cultured CTCs (Brx-82, Brx-142); n = 8.

[Figure panels recovered from the page: c, Brx-82 tumour growth for HER2(−),
HER2(+) and parental cells over 0–16 weeks. d, Metastatic frequency from
orthotopic injections, Brx-82: HER2(+) 6/8, HER2(−) 2/8, P = 0.05.
e, Tumour initiation from 200 cells, Brx-82: HER2(+) 8/8, HER2(−) 8/8, NS.]
Extended Data Figure 3 | Dynamics of HER2+ and HER2−
interconversion. a, FACS-purified HER2+ and HER2− subpopulations
from CTC line Brx-82 were monitored over 28 days to determine shifts
in the composition of sorted populations. Representative data of two
independent experiments are shown. b, Growth curves for HER2+ (red)
and HER2− (blue) FACS-purified single cell clones from CTC line
Brx-142; two-way ANOVA, P < 0.0001; n = 20. c, IHC HER2 staining of
tumour xenografts derived from unlabelled HER2− and HER2+ CTCs
showing acquisition/loss of HER2 (brown), respectively. Arrows indicate
regions of HER2 acquisition/loss. Representative image from at least five
independent fields; n = 8. ER+/HER2− and HER2-amplified breast cancers
are shown below as controls. d, Low-magnification (landscape) view of
HER2 IHC staining of tumour xenografts derived from mixed HER2+ and
HER2− CTC cultures containing either GFP-tagged HER2+ or HER2− cells
(high magnification images are shown in Fig. 2f). Top: representative GFP-
tagged HER2− cells give rise to GFP+/HER2+ cells (GFP: cytoplasmic red
stain, HER2: cell surface brown stain). Bottom: GFP-tagged HER2+ cells
produce GFP+/HER2− cells. Scale bar, 100 µm.
Extended Data Figure 4 | Proteomic and scRNA-seq analysis of HER2+
versus HER2− cells. a, b, MS-based whole cell proteome profiles (6,349
proteins) comparing HER2+ and HER2− populations from CTC lines
(Brx-42, Brx-82, Brx-142). Matched HER2+ versus HER2− proteomic
differences show significant linear correlation (Pearson correlation
coefficient = 0.71 between Brx-82 and Brx-42; Pearson correlation
coefficient = 0.64 between Brx-142 and Brx-42); NI, normalized intensity;
n = 2 per cell line are shown. c, Phospho-RTK array of HER2+ and
HER2− populations of CTC cell lines Brx-142 and Brx-82 shows increased
phosphorylation of RTKs in the HER2+ population. Numbers denote the
following: 1, HER2; 2, HER3; 3, HER4; 4, INSR; 5, EPHA1; 6, EPHA2;
7, EPHA10. Representative data from two independent experiments are
shown. d, Volcano plot depicts genes enriched in HER2+ (red) and HER2−
(blue) individual CTCs isolated from patients Brx-42 and Brx-82 and
analysed by scRNA-seq; n = 22. e, Venn diagram showing PID pathway
overlap of genes and proteins derived from scRNA-seq (Brx-42, Brx-82)
and quantitative proteomics of HER2+ CTCs, respectively.
Extended Data Figure 5 | Fifty-five panel drug screen shows differential
drug sensitivities exhibited by HER2+ versus HER2− subpopulations.
a, Heat map showing percentage cell viability (represented as decimal)
after 6 days of drug treatment of the HER2+ and HER2− subpopulations
derived from CTC lines Brx-142 and Brx-82. Red and blue represent high
and low drug sensitivities, respectively; n = 6. b, Lapatinib sensitivity
of HER2+ (red) and HER2− (blue) subpopulations of CTC line Brx-82.
MDA-231 (TNBC) and SKBR3 (HER2-amplified) are shown as controls.
c, Chemosensitivity of HER2+ (red) and HER2− (blue) subpopulations
of CTC line Brx-142. MDA-231 (blue) and SKBR3 (red) are shown as
controls. d, Sensitivity of HER2+ (red) and HER2− (blue) subpopulations
of CTC line Brx-142 to Notch inhibition with Notchi¹ (BMS-708163) and
Notchi² (RO4929097). MDA-231 and SKBR3 cells are shown as controls.
a–d, Representative of at least two independent experiments for each
condition; n = 6.
Extended Data Figure 6 | NOTCH1 expression and activity in HER2−
CTCs. a, Western blot analysis of HER2+ and HER2− subpopulations from
CTC lines Brx-142 and Brx-82 shows increased NOTCH1 in HER2− cells.
β-Actin is shown as control. Immunofluorescence analysis and scRNA-seq
of NOTCH1 (red) and HER2 (green) show inversely correlated expression
in CTC lines (Brx-142, Brx-82). b, Ectopic expression of constitutively
active Notch intracellular domain (ICD) or NRF2 results in increased
expression of the Notch1 ligand JAG1 but does not alter HER2 expression.
Representative data of two independent experiments are shown; s.e.m.
(error bars). c, ssRNA-mediated inhibition of HER2 in Brx-42 HER2+
CTCs and lapatinib-mediated inhibition of HER2 in SKBR3 cells result
in dose-dependent increases in the expression of genes involved in Notch
signalling (NOTCH1, JAG1, DLL1, HES1, HEY1, HEY2). Representative
data of two independent experiments are shown; s.e.m. (error bars).
d, Inhibition of HER2 using lapatinib or siRNA knockdown in Brx-82
HER2+ CTCs increases the expression of NRF2-driven cytoprotective
genes downstream of the Notch pathway. Representative data of two
independent experiments are shown; s.e.m. (error bars). e, Quantitation
of the interconversion of HER2+ cells from single-cell clones into 5- to
9-cell and >10-cell clusters following treatment with 10 mM H2O2; t-test,
P < 0.05; n = 10. f, Paclitaxel treatment of mice with tumours derived
from Brx-142 FACS-purified HER2+ CTCs, demonstrating a reduction
in CTCs, and HER2− CTCs with no change in counts; t-test, P < 0.05; NS,
not significant. g, Paclitaxel treatment of mice with mammary xenografts
derived from parental CTC line Brx-142 showing initial tumour response,
followed by recurrent tumour growth. IHC analysis and quantitation of
the recurrent tumour show greatly reduced HER2+ (brown stain) cell
composition in the Paclitaxel-treated tumour (T, 3 weeks post-treatment)
compared with the untreated tumour (U) and the recovered
tumour (R, 5 weeks post-treatment). Bar indicates duration of drug
treatment (Rx). Scale bar, 100 µm; two-way ANOVA, P < 0.0001; n = 6.
Representative images from five independent fields per tumour are shown
and quantified; t-test, P < 0.001. h, Dual GFP (red, cytoplasmic stain)
and HER2 (brown, cell surface stain) IHC of tumour xenografts derived
from mixed GFP-tagged HER2+ and untagged HER2− CTC cultures
demonstrating enhanced conversion from GFP+/HER2+ to GFP+/HER2−
after 4 weeks of paclitaxel treatment; t-test, P < 0.0001; n = 6. Scale bar,
100 µm. Arrows indicate interconverting cells. Representative images from
five independent fields per tumour are shown. i, Mouse tumour xenografts
derived from the CTC line Brx-142 treated with a combination of the
Notch inhibitor LY-411575 and paclitaxel show diminished tumour relapse;
n = 6. Bar indicates treatment duration.
An endosomal tether undergoes an entropic collapse to bring vesicles together

David H. Murray, Marcus Jahnel, Janelle Lauer, Mario J. Avellaneda, Nicolas Brouilly, Alice Cezanne,
Hernan Morales-Navarrete, Enrico D. Perini, Charles Ferguson, Andrei N. Lupas, Yannis Kalaidzidis,
Robert G. Parton, Stephan W. Grill & Marino Zerial


An early step in intracellular transport is the selective recognition 
of a vesicle by its appropriate target membrane, a process regulated 
by Rab GTPases via the recruitment of tethering effectors.
Membrane tethering confers higher selectivity and efficiency 
to membrane fusion than the pairing of SNAREs (soluble 
N-ethylmaleimide-sensitive factor attachment protein receptors) 
alone. Here we address the mechanism whereby a tethered
vesicle comes closer towards its target membrane for fusion by 
reconstituting an endosomal asymmetric tethering machinery 
consisting of the dimeric coiled-coil protein EEA1 (refs 6, 7) 
recruited to phosphatidylinositol 3-phosphate membranes and 
binding vesicles harbouring Rab5. Surprisingly, structural analysis 
reveals that Rab5:GTP induces an allosteric conformational change 
in EEA1, from extended to flexible and collapsed. Through dynamic 
analysis by optical tweezers, we confirm that EEA1 captures a 
vesicle at a distance corresponding to its extended conformation, 
and directly measure its flexibility and the forces induced during 
the tethering reaction. Expression of engineered EEA1 variants 
defective in the conformational change induces prominent clusters
of tethered vesicles in vivo. Our results suggest a new mechanism in 
which Rab5 induces a change in flexibility of EEA1, generating an 
entropic collapse force that pulls the captured vesicle towards the 
target membrane to initiate docking and fusion. 

EEA1, like nearly all putative coiled-coil tethering proteins, extends
more than ten times the length of SNARE proteins. To explain how
such a long molecule can mediate membrane tethering but also allow
the membranes to come closer for fusion, we reconstituted a minimal
asymmetric membrane tethering reaction in liposomes containing EEA1,
Rab5 and different fluorescent tracers (Fig. 1a and Extended Data
Fig. 1b–e). EEA1 binds to phosphatidylinositol 3-phosphate (PI(3)P)
via its carboxy (C) terminus with high affinity (dissociation constant
Kd ≈ 50 nM), and to Rab5:GTP via its amino (N) terminus with
comparatively lower affinity (Kd ≈ 2.4 µM). Liposomes containing
PI(3)P and labelled with RhoDPPE effectively recruited EEA1 and
tethered to DiD-labelled Rab5-6×His liposomes, as analysed by confocal
microscopy (Fig. 1a–c). The reaction required EEA1, Rab5 and GTP-γS,
as no co-localization was observed in the presence of GDP. The
efficiency of tethering approached that of biotin–streptavidin liposomes
(Fig. 1d). Furthermore, no co-localization was observed between pairs
of liposomes harbouring Rab5 (Fig. 1e). Therefore, Rab5, EEA1 and
PI(3)P form a minimal endosomal asymmetric membrane tethering
machinery.

In principle, the N terminus of EEA1 could also bind Rab5 in cis:
that is, on the same membrane. However, the presence of Rab5 on
both pairs of liposomes, as in early endosomes in vivo, did not
interfere with the tethering activity of EEA1 in vitro, as tethering was
indistinguishable between the asymmetric and symmetric conditions
(Fig. 1c, e). Moreover, coiled-coil prediction algorithms estimate a
central segment of ~200 nm (refs 14, 15) (Extended Data Fig. 1a),
suggesting that the molecule adopts an extended conformation. Indeed,
filamentous EEA1-positive structures emanating from the surface of
early endosomes in vivo have been observed by electron microscopy.
In further support of this interpretation, we visualized the N and C
termini of EEA1 using specific antibodies by super-resolution
microscopy in HeLa cells (Fig. 1f, g, Extended Data Fig. 1f–h and Methods).
If the N terminus of EEA1 bound Rab5 in cis, it should co-localize with
the C terminus. Strikingly, the ends of EEA1 could instead be resolved,
with the N terminus extending radially from the C terminus into the
cytoplasm. We estimated an end-to-end distance of 141 ± 47 nm
(mean ± s.d.; Fig. 1h), in the range of the predicted length and rigidity
of coiled-coils.

To characterize the distances and dynamics of the tethering reaction,
we generated bead-supported membranes (10 µm silica microspheres)
harbouring green fluorescent protein (GFP)-Rab5 (Fig. 1i and Extended
Data Fig. 2). These tethered to liposomes containing PI(3)P in the
presence of GTP-γS but not GDP in an EEA1 concentration-dependent
manner (Extended Data Fig. 2g, h). Time-lapse microscopy showed
that some liposomes were captured by the bead-supported membrane,
while others diffused away (Extended Data Fig. 2i and Supplementary
Videos 1 and 2), similar to the behaviour of endosomes in vivo. We
next measured the distances between the tethered vesicle and GFP-
Rab5 (Fig. 1j, Extended Data Fig. 2j and Methods). Surprisingly,
we observed distances ranging from 20 nm up to approximately
the predicted length of 200 nm (mean ± s.d., 84 ± 56 nm) (Fig. 1k).
Such a broad distribution is irreconcilable with the predicted length of
EEA1 and suggests that EEA1 may change its conformation.

We determined the conformation of EEA1 using rotary shadowing
electron microscopy and image analysis (Fig. 2a). The measurements of
contour length and mean end-to-end distance followed Gaussian
distributions with an average of 222 ± 26 nm (Fig. 2b, top) and 195 ± 26 nm
(Fig. 2b, bottom), respectively, confirming that the molecule is largely
extended, as in vivo¹¹ (Fig. 1g, h). However, this is incompatible with
the much shorter distances between tethered vesicles in vitro (Fig. 1k).
Therefore, we asked whether binding to Rab5 may cause EEA1 to
adopt a more compact conformation. Remarkably, this was the case.
Addition of Rab5:GTP-γS (Fig. 2c) resulted in a significant fraction
of bent EEA1 molecules having a substantially reduced end-to-end
distance of 122 ± 50 nm (Fig. 2d).

To gain further insights into this mechanism, we generated two 
mutants with alterations in the coiled-coil but retaining the Rab5- 
and PI(3)P-binding domains (Extended Data Fig. 3 and Methods). 
In the extended EEA1 mutant, we removed regions of discontinuity 
Figure 1 | EEA1, Rab5 and PI(3)P form an asymmetric tethering
machinery. a, b, Vesicle–vesicle tethering assay. Rho-DPPE liposomes
harbouring Rab5 (green) tether to DiD-PI(3)P liposomes (magenta) upon
addition of EEA1 and GTP-γS but not GDP (a, scheme; b, microscopy;
representative of n = 20). Scale bar, 2 µm. c–e, Analysis of vesicle
co-localization. Asymmetric (c) and symmetric (e) tethering required
Rab5, PI(3)P and EEA1, streptavidin–biotin control (d) (mean ± s.d., n = 3).
f–h, In vivo stochastic optical reconstruction microscopy (STORM)
defines the extension of EEA1. The N-terminal (magenta) and C-terminal
(green) domains of EEA1 (f) were differentially labelled. Representative


between heptad repeats creating a more idealized, extended coiled- 
coil. In the swapped EEA1 mutant, we swapped the coiled-coil 
regions between the N and C termini. Electron microscopy analysis 
revealed that the extended mutant was impaired in the Rab5-induced 
conformational change (Fig. 2i and Extended Data Fig. 4a-c). In 
contrast, the swapped mutant was mostly bent, often presented 
kinks, and did not significantly change conformation upon Rab5 
binding (Fig. 2j and Extended Data Fig. 4e–g). These results suggest
that coiled-coil discontinuities and their physical arrangement are 
STORM image (g, of n = 22) and quantification of EEA1 extension
(h, box–whisker plot with median, 25/75 quartiles and minimum/
maximum error bars, n = 86, representative experiment) from endosomes.
Scale bar, 500 nm. i, Bead-supported membrane tethering similar to a
and b. Representative of n = 20. Scale bar, 2 µm. j, k, Distance of tethered
vesicles (magenta) from the membrane (green). The intensity per pixel was
plotted, fitted to determine the relative distances and quantified (k) (vesicle–
membrane and Rab5–membrane, representative experiment; box–whisker
plot as in h, mean ± s.d., n = 36 and 14).


critical for the structure of EEA1 and its Rab5-induced conforma- 
tional change. 

To shed light on how EEA1 adopts a compact conformation upon 
Rab5 binding, we measured the curvature along the contour of mol- 
ecules. We aligned N-terminally MBP-tagged EEA1 and determined 
how the tangents to the contour change by 8 nm steps along the 
contour (Methods and Extended Data Fig. 5). Interestingly, the variance 
of this measure of curvature calculated over the ensemble of molecules 
increased significantly upon Rab5:GTP-γS binding (Fig. 2g), indicating


Figure 2 | EEA1 changes flexibility upon
Rab5 binding. a, c, i, j, Representative
examples of rotary-shadowing electron
microscopy of EEA1 (a), EEA1 + Rab5:GTP-γS
(c), EEA1-extended (i) and -swapped (j)
variants. Scale bar, 100 nm; n = 88, n = 212,
n = 90, n = 145, respectively. b, d, Contour
and end-to-end length histograms for EEA1
(green, n = 88) and EEA1 + Rab5:GTP-γS
(magenta, n = 212). e, f, Visual comparison
of aligned EEA1 proteins. The highlighted
ends of EEA1 + Rab5:GTP-γS lie significantly
closer to the origin. Hemispheres demarcate
50 nm. g, Variance of curvature measures along
the contour of aligned EEA1 + Rab5:GDP
(green) and EEA1 + Rab5:GTP-γS (magenta)
Figure 3 | EEA1 collapse generates a force. a, Scheme of bead-supported
membranes harbouring EEA1 or Rab5 captured by dual-trap optical
tweezers. b, c, Traps moved successively closer until interactions (arrows)
were observed, characterized by increase in force and decrease in variance
(c). d, Interaction distance consistent with length of extended EEA1. Silica
microspheres (negative control) in grey. e, Persistence length distributions
of EEA1 and variants from optical tweezers measurements. f, Force did not
depend on GTP hydrolysis (P > 0.15); n = 39, 26 respectively. g, Interaction
duration (log-scale) was prolonged by GTP-γS (P< 10 -*). Mann–Whitney–
Wilcoxon test (e–g); box–whisker plot with Tukey error bars (e–g).


that EEA1 displays a larger variety of curvatures upon Rab5:GTP binding.
Such changes occurred along the entire length of the molecule,
with some regions increasing in flexibility more than others (Fig. 2g), 
but were not observed for the EEA1 mutants (Extended Data Fig. 5f-i). 
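
The curvature measure can be illustrated with a minimal sketch, assuming each traced molecule is a list of 2D points spaced at a fixed arc-length step (8 nm in the text) and already aligned at its N terminus. The function names are hypothetical, and the authors' own procedure (Methods and Extended Data Fig. 5) may differ in detail.

```python
import math
from statistics import pvariance

def turning_angles(points):
    """Change in tangent direction (radians) between consecutive contour segments."""
    headings = [
        math.atan2(y1 - y0, x1 - x0)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    ]
    angles = []
    for first, second in zip(headings, headings[1:]):
        delta = second - first
        # Wrap the difference into [-pi, pi) so that small turns stay small.
        delta = (delta + math.pi) % (2.0 * math.pi) - math.pi
        angles.append(delta)
    return angles

def curvature_variance(ensemble):
    """Variance of the turning angle at each contour position across molecules."""
    per_molecule = [turning_angles(molecule) for molecule in ensemble]
    shortest = min(len(angles) for angles in per_molecule)
    return [
        pvariance([angles[i] for angles in per_molecule])
        for i in range(shortest)
    ]

# Two toy contours with 8 nm steps: one straight, one with a right-angle bend.
straight = [(0.0, 0.0), (8.0, 0.0), (16.0, 0.0)]
bent = [(0.0, 0.0), (8.0, 0.0), (8.0, 8.0)]
variance_profile = curvature_variance([straight, bent])
```

A stiff ensemble gives turning angles close to zero everywhere and hence a small variance; a more flexible ensemble samples a wider range of bends, which raises the variance at the affected contour positions.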

Although molecules are adsorbed onto a 2D surface, some aspects of
their 3D conformations are captured (Methods). Analysis of the
kurtosis of the distribution of angles between contour tangents indicated
that 3D shape fluctuations are retained for the entire contour of EEA1
in the presence of Rab5:GDP, but only up to 60 nm with Rab5:GTP-γS
(Methods and Extended Data Fig. 6). Moreover, tangent–tangent
correlations of the contour in this regime revealed that Rab5:GTP-γS
binding results in a faster decay. Generally, the worm-like chain
(WLC) model is used to describe fluctuations in polymer shapes and
capture aspects of the physics underlying their shape fluctuations
(Methods). In the WLC model, the polymer is considered a
homogeneous molecule with its flexibility determined by a bending stiffness
reflected in a characteristic length, the persistence length, over which
correlations between tangents to the contour decay. We applied the
WLC model to EEA1 and determined an effective persistence
length of 246 ± 42 nm for the unbound and 74 ± 3 nm for the
Rab5:GTP-γS-bound ensembles. In contrast, the extended EEA1 mutant
had similar effective persistence lengths in either state (unbound =
183 ± 13 nm and bound = 224 ± 25 nm; Supplementary Data Table).
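
To give a feel for what these persistence lengths imply, the sketch below evaluates the standard mean-square end-to-end distance of an ideal worm-like chain in three dimensions, <R²> = 2·Lp·L·[1 − (Lp/L)(1 − exp(−L/Lp))], for the contour length and persistence lengths quoted in the text. Using this closed-form 3D expression is an assumption made for illustration; the authors' analysis of surface-adsorbed molecules is described in their Methods and may differ.

```python
import math

def rms_end_to_end(contour_nm, persistence_nm):
    """Root-mean-square end-to-end distance of an ideal 3D worm-like chain."""
    ratio = contour_nm / persistence_nm
    mean_square = 2.0 * persistence_nm * contour_nm * (
        1.0 - (1.0 - math.exp(-ratio)) / ratio
    )
    return math.sqrt(mean_square)

CONTOUR_NM = 222.0  # mean contour length reported in the text
rms_unbound = rms_end_to_end(CONTOUR_NM, 246.0)  # unbound EEA1
rms_bound = rms_end_to_end(CONTOUR_NM, 74.0)     # Rab5:GTP-γS-bound EEA1
print(f"unbound: {rms_unbound:.0f} nm, bound: {rms_bound:.0f} nm")
```

Under these assumptions the values come out at roughly 193 nm and 150 nm, so the softer chain is expected to be noticeably more compact at equilibrium, which is the qualitative trend reported for the measured end-to-end distances.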

To corroborate these estimates, we fitted the radial distribution func- 
tions (that is, the probability of observing a given end-to-end distance) 
of the molecules extracted from the electron microscopy data with 
analytical solutions of the WLC model (Methods). This showed a
clear reduction in effective persistence length of EEA1 upon Rab5:GTP 
binding (Fig. 2h). In contrast, the extended EEA1 mutant maintained a 
similar radial distribution regardless of Rab5 (Extended Data Fig. 4d). 

Reducing the persistence length of EEA1 makes the molecule
flexible. However, the tether is still extended and, therefore, in an
out-of-equilibrium conformation (Fig. 2e). As a result, it will undergo
an entropic collapse, with its end-to-end distance decreasing towards
a new equilibrium (Fig. 2f). This process generates a force that could
pull the membranes together (estimated ~3 pN (Methods)). In some
sense, the extended molecule is like a loaded spring that rapidly recoils
upon Rab5 binding.
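
The loaded-spring picture can be made concrete with a rough sketch that uses the Marko–Siggia interpolation formula for the tension in a worm-like chain held at a fixed extension. This formula, the thermal energy of 4.11 pN nm and the function name are assumptions chosen for illustration; the estimate quoted above comes from the authors' Methods, which may use a different expression and different parameters.

```python
KBT_PN_NM = 4.11  # assumed thermal energy near room temperature, in pN nm

def wlc_force_pn(extension_nm, contour_nm, persistence_nm, kbt=KBT_PN_NM):
    """Marko-Siggia interpolation for the tension (pN) of a worm-like chain."""
    if not 0.0 <= extension_nm < contour_nm:
        raise ValueError("extension must lie in [0, contour length)")
    x = extension_nm / contour_nm
    return (kbt / persistence_nm) * (0.25 / (1.0 - x) ** 2 - 0.25 + x)

# Same extension and contour length, two persistence lengths from the text.
stiff = wlc_force_pn(195.0, 222.0, 246.0)    # unbound EEA1
flexible = wlc_force_pn(195.0, 222.0, 74.0)  # Rab5:GTP-γS-bound EEA1
```

In this model the force at a given extension scales inversely with the persistence length, so a tether that softens while still extended experiences a larger restoring force pulling its ends together.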

To provide experimental evidence for entropic collapse of EEA1,
we made use of high-resolution dual-trap optical tweezers (Methods).
Two 2 µm glass microspheres coated with membranes were held in optical
traps (Fig. 3a). One trap was moved closer to the other, in iterative
cycles of approaching, pausing and retracting (Fig. 3b). At distances
below 250 nm and at low concentrations of EEA1 (5–40 nM) to ensure
single-molecule events, we observed transient interactions as a decrease
in the mean and variance of the distance between the two beads (Fig. 3b,
red arrows, Fig. 3c and Extended Data Fig. 7a, d). Interactions were
infrequent, as expected for single molecules, and non-existent without
EEA1, whereas their frequency and duration increased at high
concentrations of EEA1 (400 nM) (Extended Data Fig. 7e and Methods). The
interaction distance was broad (Fig. 3d), with the mean 176 ± 76 nm
comparing favourably with rigid EEA1 (Fig. 2b).

To test the prediction that EEA1 becomes flexible upon Rab5 bind- 
ing, for each tethered molecule we determined its effective persistence 
length from the capture distance, and measured force increase (Fig. 3c) 
and bead displacements using the WLC model (Methods). Strikingly, we 
obtained a median effective persistence length of 23 ± 10 nm (Fig. 3e).
For more than 80% of the molecules the persistence length was no 
more than half of the contour length, confirming that Rab5-bound 
Figure 5 | Ultrastructural analysis of EEA1 KO and mutant rescue
cells. a, Dense filamentous network (arrowheads) around an early
endosome (asterisks) in HeLa. Many smaller vesicular or tubular profiles
were consistently observed at the network periphery. Representative of
n = 33. b, A filamentous network was less prominent in HeLa EEA1-KO
with no obvious concentration of vesicles near the endosomal surface.
Representative of n = 54. c–e, HeLa EEA1-KO expressing the extended
EEA1 variant showed clusters of vesicles throughout the cytoplasm and
no classical endosomal morphology. The clusters were clearly delineated
by a zone of cytoplasm with distinct density (circled areas). Higher
magnification revealed fine wispy material surrounding the clustered
vesicles (d, e; arrowheads) and evidence of discrete filaments (between the
arrowheads in e). Representative of n = 56. Scale bars: a, b, d, e, 500 nm;
c, 2 µm.


EEA1 is flexible. In contrast, the extended EEA1 mutant remained 
significantly more rigid than EEA1 (Fig. 3e). Rab5 binding is neces- 
sary to trigger structural and conformational changes on EEA1. When 
Rab5 was bypassed by His-tag-mediated tethering, EEA1 flexibility was
significantly lower than that of EEA1 with Rab5 (Fig. 3e). 

If EEA1 becomes flexible upon capture, an entropic pulling force
will be generated. This entropic force balances with the force exerted
by the optical traps as the molecule undergoes the collapse and as
the system finds its new equilibrium (Extended Data Fig. 7h). For
a capture distance of 195 nm and a peak collapse force of 3 pN, we
predict a force balance at ~0.6 pN (Methods), consistent with our
tweezer measurements of 0.5 ± 0.3 pN (Fig. 3c). EEA1 binding to Rab5
requires the GTP-bound form. No significant force differences were
observed in the presence of the non-hydrolysable analogue GTP-γS
or GTP (Fig. 3f). In contrast, the duration of the interaction was much
prolonged (Fig. 3g), as expected given that GTP-γS stabilizes Rab5 in
the active form. Finally, replacing EEA1–Rab5 binding with 10×
His-EEA1 tethering to Ni-NTA beads resulted in a decreased collapse
force (Extended Data Fig. 7i).

To validate in vivo the mechanism observed in vitro, we genome- 
edited HeLa cells to disrupt the EEA1 gene (HeLa EEA1-KO; Fig. 4a, 
Extended Data Fig. 8c and Methods), and analysed the distribution of 
Rab5-positive endosomes and the uptake of cargo (low-density lipo- 
protein (LDL)) by confocal microscopy (Fig. 4a). HeLa EEA1-KO dis- 
played a significant reduction in Rab5 endosome size, particularly for 
the largest endosomes (Fig. 4c), and a marked decrease in cargo (LDL) 
uptake (Fig. 4f). Expression of EEA1 rescued the normal, rounded 
morphology of endosomes (Fig. 4b and Extended Data Fig. 8f, i) and 
LDL uptake (Fig. 4c). In contrast, the expression of both extended and 
swapped EEA1 mutants generated enlarged endosomes and inhibited 
cargo uptake (Fig. 4c-f). 

Because the size of endosomes is below the resolution limit of light 
microscopy, we performed electron microscopy on the HeLa EEA1-KO 
cells (Fig. 5 and Extended Data Fig. 9). The filamentous material on 
endosomes'! was much reduced in HeLa EEA1-KO cells (Fig. 5a, b, and 
Extended Data Fig. 8n) and restored by the re-expression of EEA1 on 
endosomes that appeared normal or enlarged, consistent with the light 
microscopy analysis (Fig. 4b). Strikingly, cells expressing the extended 
EEA1 mutant had large (>1 µm) clusters of small vesicles, within areas
filled with filamentous material (Fig. 5d, e), suggesting that they are 
arrested in a tethered state (Fig. 4d, e). The distance between the teth- 
ered vesicles was significantly longer than that between endosomes 
in control cells (Extended Data Fig. 80), consistent with the mutant 
EEA1 being incapable of undergoing entropic collapse to shorter dis- 
tances (Figs 2e and 3e). Similar endosomal clusters were induced by 
the swapped mutant (Extended Data Fig. 8m). 

Our data suggest a new mechanochemical cycle of EEA1 regulated
by Rab5:GTP binding and GTP hydrolysis. On early endosomes, EEA1
is in the extended state (Fig. 2e) and increases the probability of
capturing a vesicle bearing Rab5. Similarly, it forms a Rab5-selectivity
barrier (analogous to a polymer brush). When Rab5 on an incoming
vesicle binds EEA1, it induces an allosteric conformational change,
from extended to flexible (Fig. 2f). This shows a new function of Rab
proteins beyond effector recruitment. The reduction in persistence
length of EEA1 causes its entropic collapse, releasing up to ~14 kBT of
mechanical energy (Extended Data Fig. 7k) and generating up to 3 pN
of force that could pull the vesicle closer to its target membrane where
it may diffuse or be brought by other Rab5 effectors within the
range of trans-SNARE pairing. This mechanism explains why the Rab5
machinery dramatically increases the efficiency of SNARE-mediated
membrane fusion. The mechanical energy released by EEA1 is of
the order of the free energy released by GTP hydrolysis. However,
the energy required to complete the cycle could potentially also come
from chaperones.

A key question is how Rab5 can induce such a long-range allosteric
effect. This is not uncommon among coiled-coil proteins. The
entropic collapse mechanism is different, however, for other
membrane tethering factors. In the course of this study, the GCC185
tether was shown to bend through central joints. For EEA1, instead
(1) the arrangement and structure of the coiled-coils and (2) Rab5
binding are critical for the propagation of allosteric conformational
changes (Extended Data Fig. 10). We can envisage different
mechanisms (see Supplementary Discussion), such as local register shifts.
In dynein, dynamics in the heptad register prove critical to
functionally link ATP binding and microtubule binding at opposite ends
of its coiled-coil stalk. Further ad hoc structural studies are
necessary to resolve this outstanding problem. The entropic collapse
upon stiffness reduction could be an effective and general
mechanism used not only by membrane tethers but also by many coiled-
coil proteins for generating an attractive force in diverse biological
processes.

METHODS


Statistics. Sample size was not predetermined. For cell electron microscopy, 
samples were double-blind examined. Other experiments were not randomized 
or blinded. Box-whisker plots all show median, 25/75 quartiles by box bounda- 
ries and minimum/maximum values by errors, with the exception of Fig. 3 and 
Extended Data Fig. 7 which use Tukey-defined error bars. 
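
The two whisker conventions mentioned above can be sketched as follows. This is a minimal Python illustration, not the plotting code used for the figures: the function name `box_stats` is hypothetical, the Tukey rule is taken in its usual form (whiskers reach the most extreme data points within 1.5 times the interquartile range of the box), and the quartile interpolation method is an assumption because conventions differ between software packages.

```python
import statistics


def box_stats(values, tukey=False):
    """Median, quartiles and whisker ends for a box-whisker plot.

    tukey=False: whiskers at the minimum and maximum values.
    tukey=True:  whiskers at the most extreme points within 1.5 x IQR of the box.
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    if tukey:
        iqr = q3 - q1
        low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
        inside = [v for v in data if low_fence <= v <= high_fence]
        low, high = inside[0], inside[-1]
    else:
        low, high = data[0], data[-1]
    return {"median": median, "q1": q1, "q3": q3,
            "whisker_low": low, "whisker_high": high}


# Hypothetical data with one outlier, for illustration only.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(box_stats(sample))
print(box_stats(sample, tukey=True))
```

With the minimum/maximum convention the upper whisker follows the outlier, whereas with the Tukey convention it stops at the largest value inside the fence.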

Cloning, expression and purification of proteins. Human Rab5-6 x His and GFP- 
Rab5-6 x His were expressed and purified essentially as previously described in the 
Escherichia coli expression system’. Human Rabex-5 amino-acid residues 131-394 
were PCR and restriction cloned into a pGST-parallel2 vector containing a TEV 
cleavable N-terminal glutathione-S-transferase (GST)”?*°. Expression and purifi- 
cation was performed essentially as described*". Briefly, E. coli-expressed proteins 
were transformed into BL21(DE3) cells and grown at 37°C until absorbance at
600 nm (A600 nm) of 0.8, whereupon the incubator temperature was reduced to 18°C. After
30 min, cultures were induced with 0.1 mM IPTG and grown overnight (16 h). Cell
pellets were resuspended in standard buffer (20 mM Tris pH 7.4, 150 mM NaCl,
0.5 mM TCEP) and flash frozen in liquid nitrogen. All subsequent steps were performed
at 4°C or on ice. Cell pellets were resuspended in standard buffer supplemented with
1 mM MgCl2 for GTPases, and protease inhibitor cocktail (chymostatin 6 µg/ml,
leupeptin 0.5 µg/ml, antipain-HCl 10 µg/ml, aprotinin 2 µg/ml, pepstatin 0.7 µg/ml,
APMSF 10 µg/ml), homogenized and lysed by sonication. Histidine-tagged proteins
were bound in batch to Ni-NTA resin in the presence of 20 mM imidazole,
and eluted with 200 mM imidazole. GST-tagged proteins were purified on GS resin 
(GS-4B, GE Healthcare) by binding for 2h followed by stringent washing, and 
cleavage from resin overnight. Imidazole-containing samples were immediately 
diluted after elution and tags cleaved during overnight dialysis. Following dialysis 
and tag cleavage, samples were concentrated and TEV or HRV 3C protease was 
removed by reverse purification through Ni-NTA or GS resin. Samples were then 
purified by size-exclusion chromatography on Superdex 200 columns in standard 
buffer. 

Human EEA1 was purified as a GST fusion in a pOEM series vector (Oxford 
Expression Technologies) modified to contain a HRV 3C-cleavable N-terminal 
GST and protease cleavage site or from a modified pFastBac1 vector (Thermo
Fisher Scientific)”*. Some samples were also purified as 6 x His-MBP and 10x His 
fusions from a modified pOEM vector (rotary shadowing for N-to-C terminus 
alignment, and optical tweezer control, respectively; all other experiments 
performed with tags removed). Mutants were purified identically to wild-type 
EEA1.

SF9 cells growing in ESF921 media (Expression Systems) were co-transfected 
with linearized viral genome and the expression plasmid and selected for high 
infectivity. P1 and P2 virus was generated according to the manufacturer's protocol, 
and expression screens and time courses performed to optimize expression yield. 
Best viruses were used to infect 1-2 l SF9 cells at 10^6 cells/ml at 1% vol/vol and
routinely harvested after 40-48 h at about 1.5 × 10^6 cells/ml, suspended in standard
buffer and flash frozen in liquid nitrogen. Pellets were thawed on ice and lysed by 
Dounce homogenizer. Purification took place rapidly in standard buffer at 4°C 
on GS resin in batch format. Bound protein was washed thoroughly and cleaved 
from resin by HRV 3C protease overnight. Proteins retaining 6 x His-MBP tags 
were purified on amylose resin and eluted with 10 mM maltose. Protein retaining 
10x His were eluted from Ni-NTA resin in standard buffer supplemented with 
200 mM imidazole. All EEA1 and mutants were immediately further purified by 
Superose 6 size-exclusion chromatography where they eluted as a single peak. All 
experiments were performed with a preparation confirmed for Rab5 and PI(3)P 
binding. Concentrations were determined by UV280 and Bradford assay. All 
proteins were aliquoted and flash frozen in liquid nitrogen and stored at −80°C.

EEA1 variants extended and swapped were synthesized genes optimized for
insect cell expression (Genscript). The extended mutant has regions of low coiled- 
coil prediction removed, resulting in an EEA1 construct 1,286 amino acids in 
length (versus 1,411 in wild-type EEA1) (see Extended Data Fig. 3). The swapped 
mutant has the C-terminal portion of the coiled-coil rearranged to follow the 
N-terminal Zn2+-finger domains, and the N-terminal portion of the coiled-coil
therefore rearranged to the C-terminal region of EEA1. Variants were treated iden- 
tically to wild-type EEA1 in purification. 

Static light scattering. An autosampler equipped Viskotek TDA Max system was 
used to analyse the light-scattering from purified EEA1. Sample was loaded into the
autosampler and passed through a TSKGel G5000PW column (Tosoh Biosciences) 
and fractions were subjected to scattering data acquisition. Data obtained were 
averaged across the protein elution volume and molecular masses determined in 
OmniSEC software package. 

Lipids. The following lipids were purchased and used directly: DOPC, DOPS, 
DOGS-NiNTA, RhoDPPE (Avanti), DiD (Invitrogen) and PI(3)P (Echelon 
Biosciences). Lipids were dissolved in chloroform, except PI(3)P in 1:2:0.8 
CHCl3:MeOH:H2O. All were stored at −80°C.


Rab5/PI(3)P binding by EEA1. Early endosome fusion assay was performed as 
previously described". To assess the ability of EEA1 to bind competently in a 
GTP-dependent manner to Rab5, Rab5 was bound to GS resin and subsequently 
loaded with nucleotide (GDP, GTP-γS) as previously described®. Binding of
EEA1 and all variants to immobilized Rab5 proceeded for 1 h at room temperature,
and the washed Rab5 resin was evaluated for EEA1 binding by western
blot. Similarly, the binding of EEA1 to PI(3)P containing liposomes was evaluated 
as previously described by formation of liposomes composed of DOPC:DOPS 
or DOPC:DOPS:PI(3)P (85:15 or 80:15:5 respectively)*. Briefly, liposomes were 
formed from the hydration of lipids at 1 mM in standard buffer, and combined 
with EEA1 for 1 h before ultracentrifugation to separate supernatant and pellet
for western blotting to evaluate EEA1 sedimentation. Rabbit anti-EEA1 antibody 
was made in our laboratory. 

Preparation of liposomes. Liposomes were formed by extrusion as pre- 
viously described**. Liposome compositions for fluorescence microscopy 
tethering assays were DOPC:DOPS:DOGS-NiNTA, DOPC:DOPS:PI(3)P, 
DOPC:DOPS:biotin-DPPE, with RhoDPPE and DiD where applicable. Liposome 
compositions for bead-supported membranes were DOPC:DOPS:DOGS-NiNTA, 
DOPC:DOPS:PI(3)P. Solvent was evaporated under nitrogen and vacuum over- 
night. The resulting residue was suspended in standard buffer, rapidly vortexed, 
freeze-thawed five times by submersion in liquid N2 followed by water at 40°C, and 
extruded by 11 passes through two polycarbonate membranes with a pore diameter 
of 100 nm (Avestin). Vesicles stored at 4°C were used within 5 days.

Bead-supported bilayer preparation. Silica beads (2 µm NIST-traceable
size-standards for optical tweezers, or 10 µm standard microspheres for microscopy;
Corpuscular) were thoroughly cleaned in pure ethanol and Hellmanex (1% sol.,
Hellma Analytics) before storage in water. Supported bilayers were formed as previously
described with modifications**. Liposomes composed of DOPC:DOPS 85:15
(with 5% PI(3)P and DOGS-NiNTA where applicable) were added to a solution
containing 250 mM NaCl for tethering assays (10 µm) and 100 mM for optical
tweezers (2 µm), and 5 × 10^6 beads. Liposomes were added to final concentration of
100 µM and incubated for 30 min (final volume 100 µl). Samples were washed with
20 mM Tris pH7.4 three times by addition of 1 ml followed by gentle centrifugation 
(at 380g). Final wash was with standard buffer. Salt concentrations were optimized 
by examination of homogeneity at the transverse plane followed by examination 
of the excess membrane at the coverslip plane (see Extended Data Fig. 2a—d). 
We found that the membranes were extremely robust in conditions where the 
bilayer is fully formed, and could be readily pipetted and washed, consistent with 
previous reports*°. Membrane-coated beads were used within 1h of production 
and always stored before use on a rotary suspension mixer. 

Confocal microscopy of vesicle-vesicle tethering assay. Glass coverslips were 
cleaned in ethanol, Hellmanex and thoroughly rinsed in water. In these experiments,
the following concentrations were used: 1 nM Rabex-5 (131-394), 100 nM
Rab5-6×His, 120 nM EEA1. Experiments were performed in standard buffer with
5 mM MgCl2 and 1 µM nucleotide. Liposomes and proteins were pre-mixed in
low-binding tubes at concentrations indicated, incubated for 5 min and imaged 
immediately upon addition to the coverslip. Images were acquired with a Nikon 
TiE equipped with a 60x plan-apochromat 1.2 numerical aperture W objective 
and Yokogawa CSU-X1 scan head. Images were acquired on an Andor DU-897
back-illuminated CCD. Acquired images were processed by the SQUASH package 
for Fiji®”. 

Confocal microscopy of bead-supported membrane tethering assay. A 200 µl
observation chamber (µ-Slide 8 well, uncoated, #1.5, ibidi) was pre-blocked with
BSA (1 mg/ml in standard buffer) for 1.5-2 h and washed thoroughly. Finally, 180 µl
of standard buffer containing beads was added to the sample chamber. In these 
experiments, the following concentrations were used: 1nM Rabex-5 (131-394), 
100 nM GFP-Rab5-6 x His, and the given EEA1 concentrations (between 30 
and 400 nM). Nucleotide control experiments were performed at 190nM EEA1. 
Experiments were performed in standard buffer with 2 mM MgCl2 and 1 mM
nucleotide. Rab5, Rabex-5, nucleotide, EEA1 and buffer were mixed together
in low-binding tubes at concentrations indicated, and were added to 240 µl final
volume to assure mixing throughout the chamber volume. 

Images for co-localization analysis were acquired with a Nikon TiE equipped 
with a 60× plan-apochromat 1.2 numerical aperture W objective and Yokogawa
CSU-X1 scan head. Images were acquired on an Andor DU-897 back-illuminated 
CCD. Acquired images were processed by the SQUASH package for Fiji*”. 

Data obtained for distance measurements were acquired in the same way and 
processed in Fiji by determining line profiles eight pixels wide from the centre of the 
bead outwards over an observed vesicle. These profiles were fitted with a Gaussian 
distribution. The alignment of the microscope was confirmed by imaging of 
sub-diffraction beads, revealing no clear systematic shift and a maximum positional 
error of 21 nm determined in Motion Tracking!®. Controls with sub-diffraction- 
sized multicolour particles (Methods) and distance measurements between Rab5 
itself and its resident membrane were within the measurement error of the technique
(approximately 15 nm)**.

Super-resolution imaging of EEA1 termini. HeLa cells were stained using primary 
antibodies against EEA1 N terminus (610457, prepared in mouse, BD Biosciences) 
and EEA1 C terminus (2900, prepared in rabbit, Abcam). The secondary antibodies 
were anti-mouse Alexa568 antibody (A-11004, prepared in goat, Life Technologies) 
and anti-rabbit Alexa647 (A-21244, prepared in goat, Life Technologies). 
Coverslips were mounted in STORM buffer (100 mM Tris-HCl pH8.7, 10 mM 
NaCl, 10% glucose, 15% glycerol, 0.5 mg/ml glucose oxidase, 40 µg/ml
catalase, 1% BME) and sealed with nail polish. Cells were imaged on a Zeiss Eclipse
Ti microscope equipped with a 150 mW 561 nm laser and a 300 mW 647 nm laser.
For imaging, laser intensities were set to achieve 50 mW at the rear lens of the
objective. Illumination was applied at a sub-TIRF angle through the objective 
to improve the signal to noise ratio. Videos of 24,000 frames (12,000 frames per 
channel) were acquired by groups of 6 consecutive frames using the NIS Elements 
software (Nikon). Images were aligned using 100 nm Tetraspeck beads (Thermo 
Fisher). This software was also used for peak detection and image reconstruction. 
The localization of the EEA1 termini could be distorted a maximum of approxi- 
mately 20 nm owing to the size of the antibodies. The localization accuracy of the 
secondary antibody was ~25 nm. Measured distances were determined in Fiji and 
represent distances between respective centres-of-mass. Representative experiment 
is shown, n=3. 

Sample preparation for optical trap experiments. Bead-supported membranes 
were prepared as described. The concentrations used were as in the microscopy 
experiments: 1nM Rabex-5 (131-394), 100nM Rab5-6 x His and EEA1 concen- 
trations (between 30 and 400 nM). Most experiments were performed at 40nM 
EEA1, with additional trials taking place at 4 and 400 nM. At the lowest concentrations,
single transient events became difficult to observe (<5% had interactions). At the 
highest concentrations, events were often non-transient or repeated. 

Electron microscopy. Samples were rotary-shadowed essentially as described*’. 
Briefly, samples were diluted in a spraying buffer, consisting of 100 mM ammonium 
acetate and 30% glycerol. Diluted samples were sprayed via a capillary onto freshly 
cleaved mica chips. These mica chips were mounted in the high vacuum evapo- 
rator (MED 020, Baltec) and dried. Specimens were platinum coated (5-7.5 nm) 
and carbon was evaporated. Following deposition, the replica was floated off and 
examined at 71,000× magnification and imaged onto a CCD (Morgagni 268D,
FEI; Morada G2, Olympus). 

Analysis of electron microscopy. Images obtained were processed in ImageJ by 
skeletonizing the particles. Lengths were determined directly from these data and 
represent an overestimation due to the granularity of the platinum shadowing 
(5-7.5nm granules). The bouquet plots were generated by aligning the initial five 
segments of the molecules and the entire population set was plotted. 

To determine the curvature measure, we first took the skeletonized curves and 
smoothed them with a window of 8.2 nm. These curves were then segmented 
with 301 equally spaced points, and these smoothed curves were used for the cur- 
vature calculation. We first attempted to define curvature at one segment length 
(~0.75 nm) but this analysis was too noisy to obtain meaningful description of 
the curves. We therefore determined the curvature by taking the difference of 
the tangents and dividing it by the arc length at a distance of ~15 nm (20 points).
The variance of this measure was determined, and bootstrapping with resampling 
was used to determine errors over the whole population and for 1,000 iterations. 

Although proteins are not homogeneous polymers, the WLC model cap- 
tures essential aspects of the physics underlying their shape fluctuations*“". 
Calculation of fits to all mean tangent-correlations and the equilibration analysis 
were performed using Easyworm source code in Matlab’. First, the original skel- 
etonized curves were segmented with 301 equally spaced points. These data were 
then used to calculate the tangent-correlations and the kurtosis plots. We fitted 
the regime whereby the kurtosis measurement defined that the molecules were 
equilibrated'*“**. This distance therefore varied (see Extended Data Fig. 6, kurtosis
plots), but the estimation of persistence length was only weakly dependent on
this distance. The fitting routines were then implemented up to the thermal equili- 
bration distance with bootstrapping with resampling, which was run for the whole 
population and 1,000 times to obtain errors. These are given as mean ± standard
deviation. For values and fit statistics, please refer to Supplementary Data Table. 
We did not apply the WLC model to the swapped mutant (Extended Data Fig. 4h) 
because of the lack of significant structural changes upon Rab5 binding (Fig. 2f
and Extended Data Fig. 4f). 
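
The tangent-correlation estimate of the persistence length can be sketched as follows. This Python example does not reproduce the Easyworm code; it assumes the standard worm-like chain result for chains equilibrated in two dimensions, in which the mean cosine of the angle between tangents separated by an arc length s decays as exp(−s/(2 l_p)). The function names are hypothetical, and pooling over molecules, the choice of fitting range from the kurtosis analysis, and bootstrapping are omitted for brevity.

```python
import math


def tangent_correlation(points, max_lag):
    """Mean cos(angle between tangents) for separations of 1..max_lag segments."""
    angles = [math.atan2(y1 - y0, x1 - x0)
              for (x0, y0), (x1, y1) in zip(points, points[1:])]
    corr = []
    for lag in range(1, max_lag + 1):
        cosines = [math.cos(angles[i + lag] - angles[i])
                   for i in range(len(angles) - lag)]
        corr.append(sum(cosines) / len(cosines))
    return corr


def fit_persistence_length_2d(separations, corr):
    """Least-squares fit of ln C(s) = -s / (2 l_p), constrained through the origin.

    Only positive correlation values are used, because the logarithm is
    undefined otherwise. Returns math.inf if no decay is detected.
    """
    pairs = [(s, math.log(c)) for s, c in zip(separations, corr) if c > 0]
    slope = sum(s * y for s, y in pairs) / sum(s * s for s, _ in pairs)
    if slope >= 0:
        return math.inf
    return -1.0 / (2.0 * slope)


# Illustration with an ideal, noise-free decay for l_p = 270 nm.
separations = [0.75 * k for k in range(1, 41)]  # nm, assumed segment length 0.75 nm
ideal = [math.exp(-s / (2 * 270.0)) for s in separations]
print(fit_persistence_length_2d(separations, ideal))
```

In practice the correlation would be averaged over the whole population of skeletonized molecules before fitting, and the fit would be restricted to separations below the thermal equilibration distance.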

The analytical fitting to the radial distribution functions was performed in 
Python'®. The radial distribution function for a worm-like chain is the probabil- 
ity density for finding the end points of the polymer. The polymers are considered 
as embedded in a two-dimensional space in this scheme. This treatment adopts 
the continuum model of the polymer, thereby defining the statistical properties 
via free energy calculation. Fitting to analytical solution of the WLC yielded a 
mean effective persistence length of 270 ± 14 nm for EEA1 alone (mean ± error
of fit), and two populations of effective persistence lengths (26 ± 2 nm (67%) and
300 ± 14 nm (33%)) for EEA1 in the presence of Rab5:GTP-γS.

Optical tweezer experiments. A custom-built high-resolution dual-trap optical tweezer 
microscope was used*“®. A single stable solid-state laser (Spectra-Physics, 5 W)
was split by polarization into two traps that could be independently manoeuvred. 
Forces were measured independently in both traps by back-focal plane interfer- 
ometry. Absolute distances between the two traps were determined by template- 
based video microscopy analysis (43 ± 2 nm per pixel) and offset-corrected for
each microsphere pair by repeatedly contacting the microspheres after each experiment.
The template detection algorithm had subpixel accuracy, with an estimated
uncertainty in absolute distance measurements of not more than ±20 nm. Bead
displacement was calculated according to ΔF = −κΔx. Extended Data Fig. 7g
demonstrates the sensitivity of the instrument via the Allan deviation“” for averaging
times greater than 100 ms.

All optical tweezer experiments were performed with 2 µm silica size-standard
microspheres (Corpuscular), at a temperature of 26 ± 2°C in a laminar flow chamber
with buffers containing 35% glycerol to prevent sedimentation of the silica microspheres.
Thermal calibration of the optical traps was performed with the power
spectrum method using a dynamic viscosity of 3.1 mPa s (ref. 48) (mean trap
stiffness: trap 1, κ1 = 0.035 ± 0.007 pN/nm; trap 2, κ2 = 0.029 ± 0.007 pN/nm),
leading to an overall trap stiffness of κ = 0.0159 pN/nm (yellow response curve in
Extended Data Fig. 7h). Data were acquired at 1 kHz and further processed using 
custom-written software in R. Spurious electronic noise at 50 Hz was filtered using 
a fifth-order Butterworth notch filter from 49 to 51 Hz. 
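
The overall trap stiffness quoted above is numerically consistent with treating the two traps as springs in series. The short Python check below makes that arithmetic explicit; the series model is an inference from the reported numbers, and the function name is hypothetical.

```python
def series_stiffness(k1, k2):
    """Effective stiffness of two springs in series (same units as the inputs)."""
    return k1 * k2 / (k1 + k2)


KAPPA_1 = 0.035  # pN/nm, mean stiffness of trap 1 as reported
KAPPA_2 = 0.029  # pN/nm, mean stiffness of trap 2 as reported

kappa = series_stiffness(KAPPA_1, KAPPA_2)
print(f"overall stiffness = {kappa:.4f} pN/nm")

# Bead displacement in each trap for an illustrative 1 pN force change.
delta_f = 1.0  # pN, hypothetical value
print(f"dx1 = {delta_f / KAPPA_1:.1f} nm, dx2 = {delta_f / KAPPA_2:.1f} nm")
```

The result rounds to the reported 0.0159 pN/nm.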

For probing the interactions of EEA1 with Rab5 without any assumptions on the 
shape of EEA1, a distance agnostic protocol with consecutive cycles of approach- 
ing, waiting (20s) and retraction was used, approaching closer in each iteration 
(Fig. 3b). The stationary segments were then subjected to automatic change-point 
analysis to identify regions of the time series longer than 100 ms with significantly 
different mean and variance’’. Events thus identified were classified as transient 
if the mean and variance went back to base levels within the stationary segment 
(see examples in force traces in Fig. 3c and Extended Data Fig. 7). Mean times 
of interactions were 3.4 ± 0.6 s for GTP-γS and 0.9 ± 0.2 s for GTP. A fluctuation
analysis of the differential distance signal during these events gave an estimated 
tether misalignment of less than 30° in all interactions®°. Only transient events 
were further processed. Silica beads alone as a negative control measured a mean 
contact distance of 22 nm (Fig. 3d, grey). 

To calculate the persistence length for individual captured molecules we determined
the equilibrium extension, z_eq, from the capture distance D (nm), the average
measured force increase upon tethering ΔF (pN) and the known displacements
from each trap Δx1 = ΔF/κ1 and Δx2 = ΔF/κ2 as z_eq = D − Δx1 − Δx2. With this
distance, the persistence length l_p of a molecule with contour length L was calculated according to*!

$$l_p = \frac{k_B T}{\Delta F}\left[\frac{z_{eq}}{L} - \frac{1}{4} + \frac{1}{4\,(1 - z_{eq}/L)^{2}}\right]$$


Similarly, to estimate the magnitude of the entropic collapse force, this formula was 
applied to the equilibrium extensions of EEA1, as estimated by the end-to-end dis- 
tances of the molecules from electron microscopy. Values determined were (median 
and bounds at (2.5%, 97.5%)) EEA1, 23 (14, 33) nm; extended, 73 (60, 88) nm;
swapped, 26 (21, 30) nm; 10x His, 78 (35, 140) nm. Values reported are medians 
and 95% confidence intervals determined from bootstrapping. 
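
The calculation described in the two preceding paragraphs can be sketched in Python as follows. It is a minimal illustration of the worm-like chain interpolation formula, not the custom R code used for the analysis. The function names, the temperature of 26°C, the contour length of 222 nm and the example capture distance and force are all hypothetical values chosen for illustration.

```python
K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def thermal_energy_pn_nm(temperature_k=299.15):
    """k_B*T in pN*nm; the default temperature (26 degC) is an assumption."""
    return K_B * temperature_k * 1e21


def equilibrium_extension(capture_distance, delta_f, kappa_1, kappa_2):
    """z_eq = D - dF/kappa_1 - dF/kappa_2 (nm, with dF in pN and kappa in pN/nm)."""
    return capture_distance - delta_f / kappa_1 - delta_f / kappa_2


def wlc_bracket(z_eq, contour_length):
    """Dimensionless extension term of the worm-like chain interpolation formula."""
    x = z_eq / contour_length
    if not 0.0 <= x < 1.0:
        raise ValueError("extension must be between 0 and the contour length")
    return x - 0.25 + 0.25 / (1.0 - x) ** 2


def persistence_length(delta_f, z_eq, contour_length, temperature_k=299.15):
    """Persistence length (nm) from a measured force (pN) and extension (nm)."""
    return thermal_energy_pn_nm(temperature_k) / delta_f * wlc_bracket(z_eq, contour_length)


def wlc_force(l_p, z_eq, contour_length, temperature_k=299.15):
    """Inverse use of the same formula: force (pN) for a given persistence length."""
    return thermal_energy_pn_nm(temperature_k) / l_p * wlc_bracket(z_eq, contour_length)


# Hypothetical example values, for illustration only.
z = equilibrium_extension(capture_distance=200.0, delta_f=1.0,
                          kappa_1=0.035, kappa_2=0.029)
print(f"z_eq = {z:.1f} nm")
print(f"l_p  = {persistence_length(1.0, z, contour_length=222.0):.1f} nm")
```

The same bracketed term serves both directions of the calculation: dividing k_BT times the term by the measured force gives the persistence length, and dividing by an assumed persistence length gives the force at a given extension, as used for the entropic collapse estimate.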

Generation of HeLa EEA1-KO cell line. HeLa EEA1-KO lines were generated 
using CRISPR-Cas9 technology” on HeLa-Kyoto cell lines obtained from the BAC 
recombineering facility at the Max Planck Institute of Molecular Cell Biology and 
Genetics. Cell lines were tested for mycoplasma and authenticated (Multiplexion, 
Heidelberg). pSpCas9(BB)-2A-GFP (PX458) and pSpCas9(BB)-2A-Puro (PX459)
were a gift from F. Zhang (Addgene plasmid 48138, 48139). A PX458 plas- 
mid encoding a GFP-labelled Cas9 nuclease and the sgRNA sequence (from 
GECKO* library 17446, GTGGTTAAACCATGTTAAGG, targeting first exon) 
was transfected into standard HeLa Kyoto cells with Lipofectamine 2000 fol- 
lowing the manufacturer’s instructions. Cells were cultured in DMEM media 
supplemented with 10% FBS and 1% penicillin-streptomycin at 37°C and 5% 
CO2. After 3 days, the transfected cells were FACS sorted by their GFP fluorescence
into 96-well plates to obtain single clones and visually inspected°*. These
clones were then screened by western blotting and in-del formation was confirmed by
sequencing of genomic DNA (primer forward, AGCGGCCGTCGCCACCG;
reverse, TAAGCGCCTGCCGGGCTG). Note the region is extremely GC-rich
(75%, ± 250 nt from targeted indel region). Additionally, a mixed-clonal line
was obtained by transfection of HeLa Kyoto with PX459 with the above sgRNA 
sequence. At 72 h after transfection, cells were exchanged into media supplemented
with 0.5 µg/ml puromycin (concentration determined in a separate
experiment) and selected for 3 days. All imaging experiments were confirmed
on this secondary line. 

Endocytosis rescue assays. Wild-type EEA1 and the extended and swapped 
variants (Extended Data Fig. 3) were cloned into customized mammalian expres- 
sion plasmids under the CMV promoter resulting in untagged proteins. HeLa or 
HeLa EEA1-KO cells were seeded into 96-well plates and transfected (or mock 
transfected) after 48 h. At 48 h after transfection, cells were exchanged into
serum-free media containing 8.2 µg/ml LDL-Alexa 488 (prepared as previously
described!®) or 100 ng/ml EGF-Alexa 488 (E13345, Thermo Fisher) for 10 min at 
37°C, and washed in PBS then fixed in 4% paraformaldehyde. 

Automated confocal immunofluorescence microscopy and analysis. Fixed cells 
were stained with antibodies against EEA1 (laboratory-made rabbit) and Rab5 
(610724, prepared in mouse, BD Biosciences) as previously described**. DAPI 
was used to stain the nuclei. Not all early endosomes harbour EEA1 (ref. 54) and
other tethering factors could compensate for EEA1 (refs 24, 55). All imaging was 
performed on a Yokogawa CV7000 s automated spinning disc confocal using a 
60x 1.2 numerical aperture objective. Fifteen images were acquired per well and 
each condition was duplicated at least twice per plate, resulting in 30 or more 
images per condition. 

Image analysis used home-made software, MotionTracking, as previously
described**°”. Images were first corrected for illumination, chromatic aberration
and physical shift using multicolour beads. All cells, nuclei and cell objects in 
corrected images were then segmented and their size, content and complexity 
calculated. The intensity of EEA1 in wild-type HeLa cells was measured to deter- 
mine a wild-type intensity distribution. In the rescue experiments, an intensity 
threshold for the transfections was set at about two times the mean of wild-type 
cells (Extended Data Fig. 8i). Experiments were repeated at different seeding den- 
sities with similar results. Given a cell density threshold between 10 and 100 per 
image, we obtained an average of more than 300 cells per condition after filtering for 
the transfection level of EEA1, and more than 15,000 endosomes per experiment. 
A two-tailed t-test was used for significance calculations.

Cell electron microscopy. Cells in 3 cm diameter plastic dishes were processed
for electron microscopy using a method** to provide particularly heavy staining 
of cellular components. Briefly, cells were fixed by addition of 2.5% glutaraldehyde 
in PBS for 1h at room temperature and then washed with PBS. The cells were 
then processed as described°® with sequential incubations in solutions containing 
potassium ferricyanide/osmium tetroxide, thiocarbohydrazide, osmium tetrox- 
ide, uranyl acetate and lead nitrate in aspartic acid before dehydration and flat 
embedding in resin. Sections were cut parallel to the substratum and analysed 
unstained in a JEOL 1011 transmission electron microscope (Tokyo, Japan). 
Images for quantitation were collected from coded samples (double blind) to 
avoid bias. 

Distance analysis used ImageJ. To correct for thickness of slices (60 nm), the 
following equation was used: 


[Equation not recoverable from the source.]


where P(r) is the apparent 2D distance distribution, R is the 3D distance, H is the 
thickness of the slice and Z is the normalization constant. Uncorrected distance 
was measured at 119.8 ± 78.2 nm (mean ± s.d.), which resulted in 130.0 ± 76.8 nm
corrected. 

Extended Data Figure 1 | EEA1 is a predicted extended coiled-coil
dimer that binds Rab5 in a GTP-dependent manner and extends
outwards from endosomes. a, Human EEA1 in COILS prediction reveals
a clear coiled-structure flanked by the Rab5-binding Zn2+-finger on the
N terminus and PI(3)P-binding FYVE domain on the C terminus.
b, Coomassie-stained gel of human EEA1 expressed as a GST fusion
in SF+ insect cells and purified by GS affinity, cleaved on resin, and
subsequently concentrated and separated from smaller contaminants by
size-exclusion chromatography on a Superose 6 column. c, Static light
scattering in line with size-exclusion chromatography reveals a molecular 
mass of 323 kDa, compared with a theoretical molecular mass of 326 kDa 
for a dimeric protein. d, Purified protein binds Rab5 in both standard 
and optical tweezer conditions (35% glycerol) in a GTP-dependent
manner. GST or GST-Rab5 was purified and conjugated to GS resin, and
subsequently nucleotide was exchanged to either GTP-γS or GDP using
EDTA-Mg2+-mediated exchange and subsequent wash. The GST resin was
then incubated with EEA1 in either the standard or optical tweezers buffer,
washed three times, and beads were then blotted for EEA1. e, Recombinant
EEA1 binds specifically to PI(3)P liposomes. When mixed with
POPC:POPS 85:15 liposomes, no EEA1 is observed in the liposome pellet
(CTRL). In contrast, EEA1 is pelleted with control POPC:POPS:PI(3)P
80:15:5 liposomes (PI3P). f, The N-terminal Zn2+-finger and C-terminal
FYVE domain of EEA1 were differentially labelled with specific antibodies 
and STORM microscopy performed to define their localization in HeLa 
cells. Representative STORM images of EEA1 radial extension from 
endosome of n = 22. Scale bar, 500 nm. g, h, Primary antibody binding 
controls for N and C termini. Primary antibodies for the N (g) and C (h) 
termini were left out of the staining, resulting in no unspecific secondary 
staining for each. Representative of n=5. Scale bar, 500 nm. 

Extended Data Figure 2 | Validation of bead-supported lipid bilayers
for optical tweezers, and bead tethering experiment controls and
methods. To optimize the conditions for forming supported lipid bilayers
on the 2-10 µm beads, we systematically investigated the dependence of
membrane formation on salt and liposome concentration. a, Fluorescent
profiles of supported lipid bilayer bead cross sections. At high liposome
concentration (100 µM, solid line) during formation of the bilayer on
the silica bead, the bead-supported membrane fluorescence intensity is
circumferentially homogenous. At lower lipid concentrations (10 and 1 µM,
dashed and dotted lines), less than full coverage is achieved and the
supported bilayer is inhomogeneous. b, Consistent with previous reports,
increasing salt concentrations result in more homogenous membrane
coverage. c, Representative examples of the ‘spilled-out’ membrane of
beads prepared at 100 mM (top, blue) and 250 mM (bottom, red) NaCl
salt and 100 µM liposomes, of n = 5. d, Histogram of the size of membrane
spilled from the beads onto the substrate when prepared at 100 and
250 mM NaCl (blue and red, respectively). This indicated that the lower
salt samples (blue) were homogenously covered with membrane and
that they had little excess present, and were therefore the optimal conditions
for formation of membrane on the silica beads used in tethering and in 
optical tweezer experiments. e, Segmentation of beads and vesicles by 
the SQUASH method. Bead-supported bilayers and vesicles (green and
magenta, respectively) were segmented as illustrated by red outlines to
determine their co-localization. Representative of n = 1 generated for
schematic. f, Methodology comparison for co-localization in GDP and
GTP-γS conditions. All methods give P < 0.01 in a two-tailed Student’s
t-test. Co-localization by signal is better than by size or object, as vesicles
become undercounted at high concentrations. Mean ± s.d., n = 5.
g, Co-localization of liposomes (PI(3)P, magenta) to the bead-supported
membrane (GFP-Rab5, green) was strictly dependent on GTP-γS.
Box-whisker plot with minimum/maximum error, n = 5. h, The
co-localization of liposomes to the supported membrane was dependent
on EEA1 concentration. At higher concentrations of EEA1, co-localization
approached 100%. These concentrations are within the range of the
concentration of endogenous protein”*. Mean ± s.d., n = 5. i, Time-lapse
micrographs of the bead-supported bilayer labelled with GFP-Rab5
(green), and a dynamically tethered vesicle (magenta). Vesicles were
observed to tether and reversibly leave the membrane, as well as diffuse
about its surface. Images displayed were acquired at 350 ms intervals as
z-stacks. Representative of n = 1 to acquire video. Scale bar, 2 µm.
j, Example fits for radial line-profile data.

Extended Data Figure 3 | Structure prediction and sequence description
of EEA1 mutants. a, COILS prediction for extended EEA1 mutant,
revealing removal of most of the discontinuities in the coiled-coil.
b, c, The swapped EEA1 mutant has a rearranged coiled-coil. The coiled-coil
was split as indicated by red triangles in the original EEA1-WT (b), and
the two regions a (shaded green) and b (shaded magenta) were rearranged
in a synthetic gene, producing the swapped EEA1 variant maintaining
the features and sequence of the original coiled-coil, but in an alternative
location (c). d, Full sequence alignment for human EEA1 and the extended
and swapped mutants used in the study. The crystal structure (Protein
Data Bank accession number 3MJH) for the Zn2+-finger domain is
marked in dark blue close to the N terminus. Segment a of the coiled-coil
region is marked in green, and segment b in magenta. The crystal structure
(Protein Data Bank accession number 1JOC) of the C-terminal FYVE
domain and portion of the coiled-coil is marked in cyan. Details of the 
mutant constructs are found in the Methods.

Extended Data Figure 4 | Extended and swapped EEA1 mutants exhibit
limited changes in the presence of Rab5:GTP-γS. a, e, Rotary-shadowed
EEA1-extended particles and EEA1-swapped mutants were skeletonized
and analysed in ImageJ for contour length (top), resulting in normally
distributed contour length histograms. The end-to-end length
histograms (bottom) are similarly distributed. These data were collected
on N-terminally MBP-tagged samples. Compare with wild-type in Fig. 2b, d;
n = 212 for the extended and n = 93 for the swapped variants.
b-d, f-h, The EEA1 mutants revealed limited changes to their curvature
in the presence of Rab5:GTP-γS (b, f; compare Fig. 2i, j), and therefore
minor changes to their contour and end-to-end length histograms (c, g)
and radial distribution plots (d, h); n = 80 for the extended and n = 47 for
the swapped variants. i, j, Rotary-shadowing electron microscopy of EEA1
in the presence of Rab5:GDP (n = 90), N-terminally MBP-tagged, revealed
no change in appearance compared with the absence of Rab5 entirely
(Fig. 2a), and no effect of N-terminal tagging relative to wild-type EEA1.
k, Radial distribution function of EEA1 in the presence of Rab5:GDP
(compare d, h; Fig. 2g); n = 90.

Extended Data Figure 5 | Representative segmentation, smoothing
and signed curvature measures for EEA1, and averages for EEA1 and
mutants. EEA1 and EEA1 mutants were skeletonized and smoothed using
a moving average filter with a window of 8.2 nm, segmented to 300 equally
spaced segments and aligned N terminus to C terminus by recognition of
an N-terminal MBP-tag. Their curvature was calculated at 15 nm distances
along the length of the proteins and plotted. a-c, Representative examples
of rotary shadowing derived EEA1 curves. The original data appear in
the first panel, with the second panel revealing the data after smoothing
for comparison (Methods). The curvature measure, determined by
how the tangents to the contour change at a distance of 15 nm along the
contour, is plotted below. Note that the choice of sign for the curvature
measure is arbitrary for each molecule. d, e, Curvature measure and
variance of this measure for EEA1 in the presence of Rab5:GDP (green)
and EEA1 in the presence of Rab5:GTP-γS (magenta); n = 90, n = 145,
respectively. Alignment of EEA1 curvature from the electron microscopy
data reveals an increase in curvature over the length of the molecule
upon Rab5 binding, whereas the extended and swapped EEA1 variants
show no change. All curvature values were taken to be positive given
that the N-terminal MBP could be recognized but the handedness of the
molecule adsorbed to the grid could not be inferred. Bootstrapping with
resampling at full population size was performed for 1,000 iterations to
determine errors. f, g, Extended EEA1 variant in the absence (green) and
in the presence of Rab5:GTP-γS (magenta); n = 212, n = 80, respectively.
h, i, Swapped EEA1 variant in the absence (green) and in the presence of
Rab5:GTP-γS (magenta); n = 93, n = 47, respectively.
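
The bootstrap error estimate mentioned in this caption (resampling with replacement at full population size, 1,000 iterations) can be sketched as below. This is only an illustration under stated assumptions and not the authors' analysis code: the function name `bootstrap_error`, the fixed random seed and the use of the mean as the default statistic are all choices made for the example.

```python
import random
import statistics

def bootstrap_error(values, statistic=statistics.fmean, n_iter=1000, seed=0):
    """Spread of a statistic over bootstrap resamples (hypothetical helper).

    Each resample draws len(values) items with replacement, i.e. resampling
    at full population size, as described in the caption.
    """
    rng = random.Random(seed)          # fixed seed: an assumption, for reproducibility
    n = len(values)
    estimates = [statistic(rng.choices(values, k=n)) for _ in range(n_iter)]
    return statistics.stdev(estimates)  # standard deviation of the resampled statistic

# Illustrative curvature values at one contour position (made-up numbers).
example = [0.10, 0.14, 0.09, 0.21, 0.17, 0.12, 0.15]
error = bootstrap_error(example)
```

Passing `statistics.median` or `statistics.pvariance` as `statistic` gives the corresponding error for a median or a variance instead of a mean.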
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Extended Data Figure 6 | Detailed persistence length and equilibration 
analysis for EEA1 and variants. To validate the methodology used for
analysis of the persistence lengths, and to assure internal consistency
in analysis methods, we systematically applied the analysis to EEA1
(and mutants, see Supplementary Data Table). The skeletonized curves
were segmented to 300 equally spaced segments, where θ describes the
angle between segments. The tangent-tangent correlations were then
determined for the entire ensembles. a-h, To determine the molecular
equilibration of EEA1 and variants from 3D to 2D, the kurtosis of the
theta distribution (top) was calculated. Full equilibration to 2D gives a
value of 3.0, and for 3D the expected value is 1.8 as the angle distributions
become Gaussian. As expected, the measured kurtosis is approximately
3.0 until lengths above the persistence length of the molecule, where
the equilibration begins to fail. The value at which the kurtosis began to
diverge from 2D was taken as the limit for subsequent measurements, as
beyond this limit (red shaded region) 3D fluctuations are not retained and
as such the consequences of surface adsorption are uncertain. Next, the
tangent-tangent correlation was calculated across the ensemble and fitted
up to the divergence of the kurtosis (red shaded region).

Extended Data Figure 7 | Supplementary data related to optical
tweezer experiments. a, Change-point analysis was used to identify
changes in the mean and variance of the combined force signal. An
example plot of averaged force (linear combination of signals from both
traps) with respect to time. Data have been collected at 1 kHz. Two long
transient interactions can be clearly identified. b, c, Cross-correlation
of the force signals from each trap is not sufficient to reveal stepwise
interactions as they are time-averaged. By applying cross-correlation over
a correlation window of 0.8 s (b) or 0.3 s (c), long transient interactions
(that is, at ~4 s) could be identified. However, an unbiased identification
of short transients (that is, at ~9 s) by this method was not possible. All
identified long transient interactions showed characteristic changes in
the cross-correlation: anti-correlation as beads are pulled together, and
correlation after tethering was established. d, Change-point analysis
was used to detect both changes in mean and variance of the combined
force signal, and thereby identify transient interactions (red line). This
procedure has the additional advantage of defining clear boundaries
to stepwise processes. e, The possibility of multiple tethers taking part
in the reaction was observed. Averaged force trace for wild-type EEA1
occasionally showed signals consistent with multiple interactions (cyan),
in addition to single transient interactions (red). f, Zoom into time series
around the transient interaction identified in the previous panel. To a
first approximation, the dynamic interactions were fitted as piecewise
constant steps (red). Note also that two very short (<10 ms) spikes of similar
magnitude (to the left and right of the identified interaction) occurred but are
not used in further analysis. Only transients with a duration longer than
100 ms were analysed. g, To illustrate the sensitivity of the optical tweezer
experiments, a noise analysis was performed on the segment outlined in
the top panel (yellow, labelled Allan analysis). The Allan deviation (square
root of Allan variance, in piconewtons) gives a threshold for detecting a
signal change over different averaging windows. All detected transients
(blue) are at minimum an order of magnitude above this threshold.
To provide perspective, the transient in the above example is indicated
as a red dot. h, The entropic collapse force is balanced in the tweezer
experiments below its peak value. The balance between the average
restoring force in the optical traps (brown) and the entropic collapse force
of EEA1 (blue) in the bound state gives the measured equilibrium force
and extension (red dot). The schematic assumes the measured capture
distance of 195 nm, a persistence length in the Rab5:GTP-bound state of
λ_p = 26 nm, and a contour length of 222 nm. The overall trap response
of the dual-trap system is treated as two springs in series with the mean
trap stiffness in trap 1 (κ_1 = 0.035 ± 0.007 pN/nm) and the mean trap
stiffness in trap 2 (κ_2 = 0.029 ± 0.007 pN/nm), leading to an overall trap
stiffness of κ = 0.0159 pN/nm (brown line). Given these parameters,
the predicted equilibrium force in the optical trap for Rab5-bound EEA1
is ~0.6 pN and the predicted equilibrium extension ~160 nm. i, Force
changes upon capture for Rab5:GTP-bound EEA1 and the extended
and swapped variants. Force was measured from change-point analysis
for transient interactions between EEA1 beads and Rab5:GTP beads.
To test binding per se, the force change for 10×His-EEA1 beads tethered
to Ni-NTA beads was similarly determined from established connections.
For 10×His-EEA1, no transient interactions could be observed. Median
change in force and 95% confidence interval from bootstrapping with
resampling (lower and upper bounds at (2.5%, 97.5%)) were determined.
EEA1, 0.37 (0.31, 0.46) pN; extended, 0.39 (0.35, 0.42) pN; swapped, 0.45
(0.41, 0.56) pN; 10×His, 0.19 (0.14, 0.22) pN. j, Capture distances defined
at the proximal distance upon which transient interactions were observed
for Rab5-bound EEA1 and the extended and swapped variants. Median
capture distance and 95% confidence interval from bootstrapping with
resampling (lower and upper bounds at (2.5%, 97.5%)) were determined.
EEA1, 168 (141, 182) nm; extended, 195 (189, 199) nm; swapped, 183
(179, 189) nm; 10×His, 157 (120, 196) nm; n = 60, 93, 27, 24 per condition
respectively. k, Mechanical work is performed as the tether collapses. The
mechanical work performed during the relaxation to the new equilibrium
extension is the integral under the force-extension curve. The exact value
of the extracted work depends both on the capture distance (the extension
at the moment of persistence length change) and on the release distance
(the extension at the moment when Rab5 unbinds). The uncertainties in
these extensions are different for the two positions, reflecting the different
longitudinal fluctuations of the rigid or the flexible tether (λ_flexible = 26 nm
(blue arrows), λ_rigid = 300 nm (magenta arrows)). For example, for a
relaxation between the capture distance, d_capture ≈ 195 nm, and the release
extension, d_release ≈ 122 nm, the extracted mechanical work is W ≈ 14 k_BT.

Extended Data Figure 8 | EEA1 mutants incapable of undergoing
entropic collapse result in defects in endosomal trafficking.
a, b, Automated confocal immunofluorescence images (n = 30 each)
of HeLa EEA1-KO and standard HeLa cells. EEA1 (green) and Rab5
(magenta). Scale bar, 10 μm. c, Western blot of HeLa and HeLa EEA1-KO
clonal cell line for EEA1 and Rab5. d, e, g, h, Automated confocal images
(n = 30 each) of HeLa EEA1-KO cells expressing no EEA1 (KO, d),
rescued with wild-type EEA1 (rescue, e) or extended and swapped
mutants (g, h). Cells were pulsed with fluorescently labelled cargo (LDL)
(green) for 10 min, fixed and immunostained for Rab5 (magenta) and
EEA1 (for EEA1, see Fig. 4). Magnified insets of endosomes are depicted
at arrows. Scale bar, 10 μm. f, Relative complexity of Rab5 endosomes
per cell. Each Rab5 endosome is segmented, and the segmented object
requires a defined number of 2D Gaussian functions, hereby referred to
as complexity. Relative to wild type, HeLa EEA1-KOs (black line) had a
significantly reduced number of endosomes of high complexity (>3.0),
but more endosomes defined simply by one or two Gaussian functions.
Rescue experiments (red) revealed no significant difference in complexity.
In contrast, both extended and swapped mutants (blue and green
respectively) had significantly fewer simple endosomes of low complexity,
and significantly more of higher complexity. Mean ± s.d., n = 30.


i, Histogram of fluorescence intensity of EEA1 per cell. KO cell lines had a 
sharp peak of intensity at background levels, whereas wild-type HeLa cells 
had a normal distribution. Grey box represents threshold levels of EEA1 
intensity per cell taken for analysis. j-l, EGF uptake experiments. Confocal
images of HeLa EEA1-KOs expressing wild-type EEA1 (rescue, j) or
extended and swapped mutants (k, l). Cells were pulsed with fluorescently
labelled EGF (green) for 10 min, fixed and immunostained for EEA1
(magenta). Images shown are maximum intensity projections. Scale bar,
5 μm. m, HeLa EEA1-KO cells in which the swapped EEA1 mutant was
reintroduced showed clusters of vesicles and more rarely the classical
endosomal morphology. The clusters were clearly delineated by a zone of
cytoplasm with a distinct density. Representative of n = 19. Scale bars, 2 μm.
n, Further quantifications, and the swapped mutant ultrastructural 
phenotype. Fraction of endosomal surface containing filamentous material 
for HeLa and HeLa EEA1-KOs. Box-whisker plot with minimum/ 
maximum values, n = 22, 24 endosomes. **P < 0.01, two-tailed Student’s 
t-test. o, Distance measured between endosome and tethered vesicles 
(HeLa) or between vesicles within large clusters (extended) (surface-to-surface,
n = 158 and 623 for HeLa and extended respectively; ***P < 10⁻⁴,
two-tailed Student’s t-test).

Extended Data Figure 9 | Unlabelled version of Fig. 5.

Extended Data Figure 10 | Bouquet plots of EEA1 and variants. EEA1 in
the absence of Rab5 is predominantly extended. The initial five segments
of the curves from rotary shadowing electron microscopy were aligned
and the curves plotted with the end position highlighted (dots). Grey
concentric hemispheres demarcate 50, 100, 150 and 200 nm extensions
from the origin. The end positions therefore resulted in a cloud of
empirical positions for the N terminus of EEA1 (left), and reveal
the overall change in conformational space that can be occupied by EEA1
when bound to Rab5:GTP-γS (right). b, Bouquet plots for the extended
EEA1 variant. c, Bouquet plots for the swapped EEA1 variant.

An endosomal tether undergoes an entropic collapse
to bring vesicles together

David H. Murray, Marcus Jahnel, Janelle Lauer, Mario J. Avellaneda, Nicolas Brouilly, Alice Cezanne,
Hernan Morales-Navarrete, Enrico D. Perini, Charles Ferguson, Andrei N. Lupas, Yannis Kalaidzidis, Robert G. Parton,
Stephan W. Grill & Marino Zerial


An early step in intracellular transport is the selective recognition 
of a vesicle by its appropriate target membrane, a process regulated 
by Rab GTPases via the recruitment of tethering effectors.
Membrane tethering confers higher selectivity and efficiency 
to membrane fusion than the pairing of SNAREs (soluble 
N-ethylmaleimide-sensitive factor attachment protein receptors)
alone. Here we address the mechanism whereby a tethered
vesicle comes closer towards its target membrane for fusion by 
reconstituting an endosomal asymmetric tethering machinery 
consisting of the dimeric coiled-coil protein EEA1 (refs 6, 7) 
recruited to phosphatidylinositol 3-phosphate membranes and 
binding vesicles harbouring Rab5. Surprisingly, structural analysis 
reveals that Rab5:GTP induces an allosteric conformational change 
in EEA1, from extended to flexible and collapsed. Through dynamic 
analysis by optical tweezers, we confirm that EEA1 captures a 
vesicle at a distance corresponding to its extended conformation, 
and directly measure its flexibility and the forces induced during 
the tethering reaction. Expression of engineered EEA1 variants
defective in the conformational change induces prominent clusters
of tethered vesicles in vivo. Our results suggest a new mechanism in 
which Rab5 induces a change in flexibility of EEA1, generating an 
entropic collapse force that pulls the captured vesicle towards the 
target membrane to initiate docking and fusion. 

EEA1, as nearly all putative coiled-coil tethering proteins, extends
more than ten times the length of SNARE proteins. To explain how
such a long molecule can mediate membrane tethering but also allow
the membranes to come closer for fusion, we reconstituted a minimal
asymmetric membrane tethering in liposomes containing EEA1,
Rab5 and different fluorescent tracers (Fig. 1a and Extended Data
Fig. 1b-e). EEA1 binds to phosphatidylinositol 3-phosphate (PI(3)P)
via its carboxy (C) terminus with high affinity (dissociation constant
K_d ≈ 50 nM), and to Rab5:GTP via its amino (N) terminus with
comparatively lower affinity (K_d ~ 2.4 μM). Liposomes containing
PI(3)P and labelled with Rho-DPPE effectively recruited EEA1 and tethered
to DiD-labelled Rab5-6×His-liposomes, as analysed by confocal
microscopy (Fig. 1a-c). The reaction required EEA1, Rab5 and GTP-γS,
as no co-localization was observed in the presence of GDP. The efficiency
of tethering approached that of biotin-streptavidin liposomes
(Fig. 1d). Furthermore, no co-localization was observed between pairs
of liposomes harbouring Rab5 (Fig. 1e). Therefore, Rab5, EEA1 and
PI(3)P form a minimal endosomal asymmetric membrane tethering
machinery.

In principle, the N terminus of EEA1 could also bind Rab5 in cis: 
that is, on the same membrane. However, the presence of Rab5 on 
both pairs of liposomes, as in early endosomes in vivo, did not interfere
with the tethering activity of EEA1 in vitro, as tethering was
indistinguishable between the asymmetric and symmetric conditions
(Fig. 1c, e). Moreover, coiled-coil prediction algorithms estimate a central
segment of nearly 200 nm (refs 14, 15) (Extended Data Fig. 1a),
suggesting that the molecule adopts an extended conformation. Indeed,
filamentous EEA1-positive structures emanating from the surface of
early endosomes in vivo have been observed by electron microscopy.
In further support of this interpretation, we visualized the N and C 
termini of EEA1 using specific antibodies by super-resolution micros- 
copy in HeLa cells (Fig. 1f, g, Extended Data Fig. 1f-h and Methods). 
If the N terminus of EEA1 bound Rab5 in cis, it should co-localize with
the C terminus. Strikingly, the ends of EEA1 could instead be resolved,
with the N terminus extending radially from the C terminus into the
cytoplasm. We estimated an end-to-end distance of 141 ± 47 nm
(mean + s.d.; Fig. 1h), in the range of the predicted length and rigidity 
of coiled-coils. 

To characterize the distances and dynamics of the tethering reaction,
we generated bead-supported membranes (10 μm silica microspheres)
harbouring green fluorescent protein (GFP)-Rab5 (Fig. 1i and Extended
Data Fig. 2). These tethered to liposomes containing PI(3)P in the presence
of GTP-γS but not GDP in an EEA1 concentration-dependent
manner (Extended Data Fig. 2g, h). Time-lapse microscopy showed
that some liposomes were captured by the bead-supported membrane,
while others diffused away (Extended Data Fig. 2i and Supplementary
Videos 1 and 2), similar to the behaviour of endosomes in vivo. We
next measured the distances between the tethered vesicle and GFP-Rab5
(Fig. 1j, Extended Data Fig. 2j and Methods). Surprisingly,
we observed distances ranging from 20 nm up to approximately
the predicted length of 200 nm (mean ± s.d., 84 ± 56 nm) (Fig. 1k).
Such a broad distribution is irreconcilable with the predicted length of
EEA1 and suggests that EEA1 may change its conformation.

We determined the conformation of EEA1 using rotary shadowing 
electron microscopy and image analysis (Fig. 2a). The measurements of 
contour length and mean end-to-end distance followed Gaussian distributions
with an average of 222 ± 26 nm (Fig. 2b, top) and 195 ± 26 nm
(Fig. 2b, bottom), respectively, confirming that the molecule is largely
extended, as in vivo (Fig. 1g, h). However, this is incompatible with
the much shorter distances between tethered vesicles in vitro (Fig. 1k).
Therefore, we asked whether binding to Rab5 may cause EEA1 to
adopt a more compact conformation. Remarkably, this was the case.
Addition of Rab5:GTP-γS (Fig. 2c) resulted in a significant fraction
of bent EEA1 molecules having a substantially reduced end-to-end
distance of 122 ± 50 nm (Fig. 2d).

To gain further insights into this mechanism, we generated two 
mutants with alterations in the coiled-coil but retaining the Rab5- 
and PI(3)P-binding domains (Extended Data Fig. 3 and Methods). 
In the extended EEA1 mutant, we removed regions of discontinuity
between heptad repeats creating a more idealized, extended coiled-coil.
In the swapped EEA1 mutant, we swapped the coiled-coil
regions between the N and C termini. Electron microscopy analysis
revealed that the extended mutant was impaired in the Rab5-induced
conformational change (Fig. 2i and Extended Data Fig. 4a-c). In
contrast, the swapped mutant was mostly bent, often presented
kinks, and did not significantly change conformation upon Rab5
binding (Fig. 2j and Extended Data Fig. 4e-g). These results suggest
that coiled-coil discontinuities and their physical arrangement are
critical for the structure of EEA1 and its Rab5-induced conformational
change.

Figure 1 | EEA1, Rab5 and PI(3)P form an asymmetric tethering
machinery. a, b, Vesicle-vesicle tethering assay. Rho-DPPE liposomes
harbouring Rab5 (green) tether to DiD-PI(3)P liposomes (magenta) upon
addition of EEA1 and GTP-γS but not GDP (a, scheme; b, microscopy;
representative of n = 20). Scale bar, 2 μm. c-e, Analysis of vesicle
co-localization. Asymmetric (c) and symmetric (e) tethering required
Rab5, PI(3)P and EEA1, streptavidin-biotin control (d) (mean ± s.d., n = 3).
f-h, In vivo stochastic optical reconstruction microscopy (STORM)
defines the extension of EEA1. The N-terminal (magenta) and C-terminal
(green) domains of EEA1 (f) were differentially labelled. Representative
STORM image (g, of n = 22) and quantification of EEA1 extension
(h, box-whisker plot with median, 25/75 quartiles and minimum/maximum
error bars, n = 86, representative experiment) from endosomes.
Scale bar, 500 nm. i, Bead-supported membrane tethering similar to a
and b. Representative of n = 20. Scale bar, 2 μm. j, k, Distance of tethered
vesicles (magenta) from the membrane (green). The intensity per pixel was
plotted, fitted to determine the relative distances and quantified (k)
(vesicle-membrane and Rab5-membrane, representative experiment; box-whisker
plot as in h, mean ± s.d., n = 36 and 14).

Figure 2 | EEA1 changes flexibility upon
Rab5 binding. a, c, i, j, Representative
examples of rotary-shadowing electron
microscopy of EEA1 (a), EEA1 + Rab5:GTP-γS
(c), EEA1-extended (i) and -swapped (j)
variants. Scale bar, 100 nm; n = 88, n = 212,
n = 90, n = 145, respectively. b, d, Contour
and end-to-end length histograms for EEA1
(green, n = 88) and EEA1 + Rab5:GTP-γS
(magenta, n = 212). e, f, Visual comparison
of aligned EEA1 proteins. The highlighted
ends of EEA1 + Rab5:GTP-γS lie significantly
closer to the origin. Hemispheres demarcate
50 nm. g, Variance of curvature measures along
the contour of aligned EEA1 + Rab5:GDP
(green) and EEA1 + Rab5:GTP-γS (magenta)

Figure 3 | EEA1 collapse generates a force. a, Scheme of bead-supported
membranes harbouring EEA1 or Rab5 captured by dual-trap optical
tweezers. b, c, Traps moved successively closer until interactions (arrows)
were observed, characterized by increase in force and decrease in variance
(c). d, Interaction distance consistent with length of extended EEA1. Silica
microspheres (negative control) in grey. e, Persistence length distributions
of EEA1 and variants from optical tweezers measurements. f, Force did not
depend on GTP hydrolysis (P > 0.15); n = 39, 26 respectively. g, Interaction
duration (log-scale) was prolonged by GTP-γS (P< 10 -*). Mann-Whitney-Wilcoxon
test (e-g); box-whisker plot with Tukey error bars (e-g).

To shed light on how EEA1 adopts a compact conformation upon
Rab5 binding, we measured the curvature along the contour of molecules.
We aligned N-terminally MBP-tagged EEA1 and determined
how the tangents to the contour change by 8 nm steps along the
contour (Methods and Extended Data Fig. 5). Interestingly, the variance
of this measure of curvature calculated over the ensemble of molecules
increased significantly upon Rab5:GTP-γS binding (Fig. 2g), indicating
that EEA1 displays a larger variety of curvatures upon Rab5:GTP binding.
Such changes occurred along the entire length of the molecule,
with some regions increasing in flexibility more than others (Fig. 2g),
but were not observed for the EEA1 mutants (Extended Data Fig. 5f-i).
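
A curvature measure of the kind described in the text, namely the change in tangent direction over a fixed step along the contour and its variance across an ensemble of aligned molecules, could be computed along the following lines. This is a minimal sketch, not the authors' pipeline: it assumes each contour has already been smoothed, aligned and resampled to the same number of equally spaced 2D points, and the function names and the `lag` parameter (step expressed in segments) are hypothetical.

```python
import math
import statistics

def tangent_angles(points):
    """Direction (radians) of each segment between consecutive contour points."""
    return [math.atan2(y1 - y0, x1 - x0)
            for (x0, y0), (x1, y1) in zip(points, points[1:])]

def curvature_measure(points, lag):
    """Change in tangent direction between segments `lag` steps apart."""
    angles = tangent_angles(points)
    changes = []
    for a, b in zip(angles, angles[lag:]):
        # Wrap into [-pi, pi) so the atan2 branch cut does not create jumps.
        changes.append((b - a + math.pi) % (2 * math.pi) - math.pi)
    return changes

def curvature_variance(contours, lag):
    """Variance of the curvature measure across molecules at each position.

    Assumes all contours have the same number of points (hypothetical input).
    """
    per_molecule = [curvature_measure(c, lag) for c in contours]
    return [statistics.pvariance(values) for values in zip(*per_molecule)]

# Toy contours: a straight rod and a gently bent arc (made-up data).
rod = [(float(i), 0.0) for i in range(50)]
arc = [(math.cos(0.01 * i), math.sin(0.01 * i)) for i in range(50)]
variance_profile = curvature_variance([rod, arc], lag=5)
```

A larger variance at a given position means the ensemble explores a wider range of local bends there, which is the quantity compared between the Rab5:GDP and Rab5:GTP-γS conditions.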

Although molecules are adsorbed onto a 2D surface, some aspects of 
their 3D conformations are captured (Methods). Analysis of the kur- 
tosis of the distribution of angles between contour tangents indicated 
that 3D shape fluctuations are retained for the entire contour of EEA1 
in the presence of Rab5:GDP, but only up to 60 nm with Rab5:GTP-γS
(Methods and Extended Data Fig. 6). Moreover, tangent-tangent
correlations of the contour in this regime revealed that Rab5:GTP-γS
binding results in a faster decay. Generally, the worm-like chain
(WLC) model is used to describe fluctuations in polymer shapes and
capture aspects of the physics underlying their shape fluctuations
(Methods). In the WLC model, the polymer is considered a homogeneous
molecule with its flexibility determined by a bending stiffness
reflected in a characteristic length, the persistence length, over which
correlations between tangents to the contour decay. We applied the
WLC model to EEA1 and determined an effective persistence
length of 246 ± 42 nm for the unbound and 74 ± 3 nm for the
Rab5:GTP-γS-bound ensembles. In contrast, the extended EEA1 mutant
had similar effective persistence lengths in either state (unbound =
183 ± 13 nm and bound = 224 ± 25 nm; Supplementary Data Table).
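
To illustrate how a persistence length follows from the decay of tangent-tangent correlations, the sketch below averages cos(Δθ) over an ensemble and fits an exponential decay. It is an illustration only, under assumptions that are not taken from the paper: the correlation is modelled as exp(-s / (dim_factor × λ_p)), where `dim_factor` is 1 for chains that retain 3D fluctuations and 2 for chains fully equilibrated in 2D, and the function names are hypothetical.

```python
import math

def tangent_correlation(angle_sets, lag):
    """Mean cos(theta_i+lag - theta_i) over all molecules and positions.

    `angle_sets` holds one list of tangent angles per molecule; `lag` must be
    shorter than every list (assumed, not checked).
    """
    total, count = 0.0, 0
    for angles in angle_sets:
        for a, b in zip(angles, angles[lag:]):
            total += math.cos(b - a)
            count += 1
    return total / count

def fit_persistence_length(distances, correlations, dim_factor=1.0):
    """Least-squares fit of ln C(s) = -s / (dim_factor * lp) through the origin."""
    num = sum(s * math.log(c) for s, c in zip(distances, correlations))
    den = sum(s * s for s in distances)
    slope = num / den                      # negative for a decaying correlation
    return -1.0 / (dim_factor * slope)

# Synthetic check: correlations generated with lp = 74 nm are recovered.
separations = [10.0, 20.0, 30.0, 40.0]     # nm along the contour
synthetic = [math.exp(-s / 74.0) for s in separations]
lp = fit_persistence_length(separations, synthetic)
```

In practice the fit range would be restricted to the regime where the kurtosis analysis indicates that shape fluctuations are retained.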

To corroborate these estimates, we fitted the radial distribution func- 
tions (that is, the probability of observing a given end-to-end distance) 
of the molecules extracted from the electron microscopy data with 
analytical solutions of the WLC model (Methods). This showed a
clear reduction in effective persistence length of EEA1 upon Rab5:GTP 
binding (Fig. 2h). In contrast, the extended EEA1 mutant maintained a 
similar radial distribution regardless of Rab5 (Extended Data Fig. 4d). 

Reducing the persistence length of EEA1 makes the molecule
flexible. However, the tether is still extended and, therefore, in an
out-of-equilibrium conformation (Fig. 2e). As a result, it will undergo
an entropic collapse, with its end-to-end distance decreasing towards
a new equilibrium (Fig. 2f). This process generates a force that could
pull the membranes together (estimated ~3 pN (Methods)). In some
sense, the extended molecule is like a loaded spring that rapidly recoils
upon Rab5 binding.
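
The order of magnitude of this collapse force, and of the mechanical work it can release, can be checked with a short calculation. The sketch below assumes the widely used Marko-Siggia interpolation formula for the WLC force-extension relation and a thermal energy of 4.11 pN nm; neither is stated in this section, so both are assumptions. The extensions, contour length and persistence length are the values quoted in Extended Data Fig. 7h, k (capture at 195 nm, release at 122 nm, contour length 222 nm, persistence length 26 nm).

```python
KBT_PN_NM = 4.11  # thermal energy near room temperature in pN nm (assumed)

def wlc_force(extension, contour, persistence, kbt=KBT_PN_NM):
    """Marko-Siggia interpolation for the WLC restoring force, in pN."""
    u = extension / contour
    if not 0.0 <= u < 1.0:
        raise ValueError("extension must lie in [0, contour length)")
    return (kbt / persistence) * (0.25 / (1.0 - u) ** 2 - 0.25 + u)

def wlc_work_in_kbt(start, end, contour, persistence):
    """Work, in units of kBT, released on relaxing from `start` to `end`.

    Uses the analytic integral of the interpolation formula above.
    """
    def antiderivative(u):
        return 0.25 / (1.0 - u) - 0.25 * u + 0.5 * u * u
    scale = contour / persistence
    return scale * (antiderivative(start / contour) - antiderivative(end / contour))

# Flexible (Rab5-bound) tether still held at the capture distance.
peak_force = wlc_force(195.0, 222.0, 26.0)
# Relaxation from capture (195 nm) to release (122 nm).
work = wlc_work_in_kbt(195.0, 122.0, 222.0, 26.0)
```

Under these assumptions `peak_force` comes out just under 3 pN and `work` close to 14 kBT, in line with the estimates quoted in the text and in Extended Data Fig. 7k.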

To provide experimental evidence for entropic collapse of EEA1, 
we made use of high-resolution dual-trap optical tweezers (Methods). 
Two 2 μm glass microspheres coated with membranes were held in optical
traps (Fig. 3a). One trap was moved closer to the other, in iterative 
cycles of approaching, pausing and retracting (Fig. 3b). At distances 
below 250 nm and at low concentrations of EEA1 (5-40 nM) to ensure 
single-molecule events, we observed transient interactions as a decrease 
in the mean and variance of the distance between the two beads (Fig. 3b, 
red arrows, Fig. 3c and Extended Data Fig. 7a, d). Interactions were 
infrequent, as expected for single molecules and non-existent without
EEA1, whereas their frequency and duration increased at high concentrations
of EEA1 (400 nM) (Extended Data Fig. 7e and Methods). The
interaction distance was broad (Fig. 3d), with the mean 176 ± 76 nm
comparing favourably with rigid EEA1 (Fig. 2b). 

To test the prediction that EEA1 becomes flexible upon Rab5 bind- 
ing, for each tethered molecule we determined its effective persistence 
length from the capture distance, and measured force increase (Fig. 3c) 
and bead displacements using the WLC model (Methods). Strikingly, we 
obtained a median effective persistence length of 23 ± 10 nm (Fig. 3e).
For more than 80% of the molecules the persistence length was no 
more than half of the contour length, confirming that Rab5-bound
EEA1 is flexible. In contrast, the extended EEA1 mutant remained
significantly more rigid than EEA1 (Fig. 3e). Rab5 binding is necessary
to trigger structural and conformational changes on EEA1. When
Rab5 was bypassed by His-tag-mediated tethering, EEA1 flexibility was
significantly lower than that of EEA1 with Rab5 (Fig. 3e).

Figure 5 | Ultrastructural analysis of EEA1 KO and mutant rescue
cells. a, Dense filamentous network (arrowheads) around an early
endosome (asterisks) in HeLa. Many smaller vesicular or tubular profiles
were consistently observed at the network periphery. Representative of
n = 33. b, A filamentous network was less prominent in HeLa EEA1-KO
with no obvious concentration of vesicles near the endosomal surface.
Representative of n = 54. c-e, HeLa EEA1-KO expressing the extended
EEA1 variant showed clusters of vesicles throughout the cytoplasm and
no classical endosomal morphology. The clusters were clearly delineated
by a zone of cytoplasm with distinct density (circled areas). Higher
magnification revealed fine wispy material surrounding the clustered
vesicles (d, e; arrowheads) and evidence of discrete filaments (between the
arrowheads in e). Representative of n = 56. Scale bars: a, b, d, e, 500 nm;
c, 2 μm.

If EEA1 becomes flexible upon capture, an entropic pulling force 
will be generated. This entropic force balances with the force exerted 
by the optical traps as the molecule undergoes the collapse and as 
the system finds its new equilibrium (Extended Data Fig. 7h). For
a capture distance of 195 nm and a peak collapse force of 3 pN, we
predict a force balance at ~0.6 pN (Methods), consistent with our
tweezer measurements of 0.5 ± 0.3 pN (Fig. 3c). EEA1 binding to Rab5
requires the GTP-bound form. No significant force differences were
observed in the presence of the non-hydrolysable analogue GTP-γS
or GTP (Fig. 3f). In contrast, the duration of the interaction was much
prolonged (Fig. 3g), as expected given that GTP-γS stabilizes Rab5 in
the active form. Finally, replacing EEA1-Rab5 binding with 10×His-EEA1
tethering to Ni-NTA beads resulted in a decreased collapse
force (Extended Data Fig. 7i).
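
The predicted force balance can be reproduced approximately by equating a linear trap restoring force with the entropic force of the flexible tether. The sketch below is an illustration under assumptions rather than the Methods calculation: it treats the two traps as springs in series (as stated in Extended Data Fig. 7h), assumes the trap force grows as κ × (d_capture - x) when the tether shortens to extension x, and uses the Marko-Siggia interpolation with k_BT = 4.11 pN nm for the tether. Function names are hypothetical.

```python
KBT_PN_NM = 4.11  # thermal energy near room temperature in pN nm (assumed)

def wlc_force(extension, contour, persistence, kbt=KBT_PN_NM):
    """Marko-Siggia interpolation for the WLC restoring force, in pN."""
    u = extension / contour
    return (kbt / persistence) * (0.25 / (1.0 - u) ** 2 - 0.25 + u)

def series_stiffness(k1, k2):
    """Effective stiffness of two springs in series."""
    return k1 * k2 / (k1 + k2)

def equilibrium_extension(capture, contour, persistence, stiffness, tol=1e-9):
    """Bisection for the extension where tether force equals trap force.

    Requires capture < contour. The tether force rises with extension while
    the trap force falls, so the crossing point is unique.
    """
    lo, hi = 0.0, capture
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if wlc_force(mid, contour, persistence) > stiffness * (capture - mid):
            hi = mid   # tether pulls harder than the traps: equilibrium is shorter
        else:
            lo = mid
    return 0.5 * (lo + hi)

kappa = series_stiffness(0.035, 0.029)                   # pN/nm, trap values from the caption
x_eq = equilibrium_extension(195.0, 222.0, 26.0, kappa)  # nm
f_eq = kappa * (195.0 - x_eq)                            # pN
```

With these inputs the effective stiffness is about 0.0159 pN/nm, and the balance falls near 160 nm and a little under 0.6 pN, matching the predicted values quoted above.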

To validate in vivo the mechanism observed in vitro, we genome- 
edited HeLa cells to disrupt the EEA1 gene (HeLa EEA1-KO; Fig. 4a, 
Extended Data Fig. 8c and Methods), and analysed the distribution of 
Rab5-positive endosomes and the uptake of cargo (low-density lipo- 
protein (LDL)) by confocal microscopy (Fig. 4a). HeLa EEA1-KO dis- 
played a significant reduction in Rab5 endosome size, particularly for 
the largest endosomes (Fig. 4c), and a marked decrease in cargo (LDL) 
uptake (Fig. 4f). Expression of EEA1 rescued the normal, rounded 
morphology of endosomes (Fig. 4b and Extended Data Fig. 8f, i) and 
LDL uptake (Fig. 4c). In contrast, the expression of both extended and 
swapped EEA1 mutants generated enlarged endosomes and inhibited 
cargo uptake (Fig. 4c-f). 

Because the size of endosomes is below the resolution limit of light 
microscopy, we performed electron microscopy on the HeLa EEA1-KO 
cells (Fig. 5 and Extended Data Fig. 9). The filamentous material on 
endosomes'! was much reduced in HeLa EEA1-KO cells (Fig. 5a, b, and 
Extended Data Fig. 8n) and restored by the re-expression of EEA1 on 
endosomes that appeared normal or enlarged, consistent with the light 
microscopy analysis (Fig. 4b). Strikingly, cells expressing the extended 
EEA1 mutant had large (>1 μm) clusters of small vesicles, within areas
filled with filamentous material (Fig. 5d, e), suggesting that they are 
arrested in a tethered state (Fig. 4d, e). The distance between the teth- 
ered vesicles was significantly longer than that between endosomes 
in control cells (Extended Data Fig. 8o), consistent with the mutant
EEA1 being incapable of undergoing entropic collapse to shorter dis- 
tances (Figs 2e and 3e). Similar endosomal clusters were induced by 
the swapped mutant (Extended Data Fig. 8m). 

Our data suggest a new mechanochemical cycle of EEA1 regulated 
by Rab5:GTP binding and GTP hydrolysis. On early endosomes, EEA1 
is in the extended state (Fig. 2e) and increases the probability of capturing
a vesicle bearing Rab5. Similarly, it forms a Rab5-selectivity
barrier (analogous to a polymer brush)*!. When Rab5 on an incoming 
vesicle binds EEA1, it induces an allosteric conformational change, 
from extended to flexible (Fig. 2f). This shows a new function of Rab 
proteins beyond effector recruitment. The reduction in persistence 
length of EEA1 causes its entropic collapse, releasing up to ~14 k_BT of
mechanical energy (Extended Data Fig. 7k) and generating up to 3 pN 
of force that could pull the vesicle closer to its target membrane where 
it may diffuse’? or be brought by other Rab5 effectors”** within the 
range of trans-SNARE pairing. This mechanism explains why the Rab5 
machinery dramatically increases the efficiency of SNARE-mediated 
membrane fusion”’. The mechanical energy released by EEA1 is of 
the order of the free energy released by GTP hydrolysis. However, 
the energy required to complete the cycle could potentially also come 
from chaperones. 

A key question is how Rab5 can induce such a long-range allosteric 
effect. This is not uncommon among coiled-coil proteins**”°. The 
entropic collapse mechanism is different, however, for other mem- 
brane tethering factors”’. In the course of this study, the GCC185 
tether was shown to bend through central joints*”. For EEA1, instead 
(1) the arrangement and structure of the coiled-coils and (2) Rab5 
binding are critical for the propagation of allosteric conformational 
changes (Extended Data Fig. 10). We can envisage different mecha- 
nisms (see Supplementary Discussion), such as local register shifts. 
In dynein, dynamics in the heptad register prove critical to func- 
tionally link ATP binding and microtubule binding at opposite ends 
of its coiled-coil stalk’*’’. Further ad hoc structural studies are nec- 
essary to resolve this outstanding problem. The entropic collapse 
upon stiffness reduction could be an effective and general mecha- 
nism used not only by membrane tethers but also by many coiled- 
coil proteins for generating an attractive force in diverse biological 
processes. 
METHODS


Statistics. Sample size was not predetermined. For cell electron microscopy, 
samples were double-blind examined. Other experiments were not randomized 
or blinded. Box-whisker plots all show median, 25/75 quartiles by box bounda- 
ries and minimum/maximum values by errors, with the exception of Fig. 3 and 
Extended Data Fig. 7 which use Tukey-defined error bars. 

Cloning, expression and purification of proteins. Human Rab5-6×His and
GFP-Rab5-6×His were expressed and purified essentially as previously described in the
Escherichia coli expression system’. Human Rabex-5 amino-acid residues 131–394
were PCR and restriction cloned into a pGST-parallel2 vector containing a TEV
cleavable N-terminal glutathione-S-transferase (GST)”?*°. Expression and purification
was performed essentially as described*". Briefly, constructs for E. coli-expressed proteins
were transformed into BL21(DE3) cells, which were grown at 37 °C until absorbance at
600 nm (A600 nm) of 0.8, whereupon the incubator temperature was reduced to 18 °C. After
30 min, cultures were induced with 0.1 mM IPTG and grown overnight (16 h). Cell
pellets were resuspended in standard buffer (20 mM Tris pH 7.4, 150 mM NaCl,
0.5 mM TCEP) and flash frozen in liquid nitrogen. All subsequent steps were performed
at 4 °C or on ice. Cell pellets were resuspended in standard buffer supplemented with
1 mM MgCl2 for GTPases, and protease inhibitor cocktail (chymostatin 6 μg/ml,
leupeptin 0.5 μg/ml, antipain-HCl 10 μg/ml, aprotinin 2 μg/ml, pepstatin 0.7 μg/ml,
APMSF 10 μg/ml), homogenized and lysed by sonication. Histidine-tagged proteins
were bound in batch to Ni-NTA resin in the presence of 20 mM imidazole,
and eluted with 200 mM imidazole. GST-tagged proteins were purified on GS resin 
(GS-4B, GE Healthcare) by binding for 2h followed by stringent washing, and 
cleavage from resin overnight. Imidazole-containing samples were immediately 
diluted after elution and tags cleaved during overnight dialysis. Following dialysis 
and tag cleavage, samples were concentrated and TEV or HRV 3C protease was 
removed by reverse purification through Ni-NTA or GS resin. Samples were then 
purified by size-exclusion chromatography on Superdex 200 columns in standard 
buffer. 

Human EEA1 was purified as a GST fusion in a pOEM series vector (Oxford 
Expression Technologies) modified to contain a HRV 3C-cleavable N-terminal 
GST and protease cleavage site or from a modified pFastBac1 vector (Thermo
Fisher Scientific)”*. Some samples were also purified as 6×His-MBP and 10×His
fusions from a modified pOEM vector (rotary shadowing for N-to-C terminus
alignment, and optical tweezer control, respectively; all other experiments
performed with tags removed). Mutants were purified identically to wild-type
EEA1.

SF9 cells growing in ESF921 media (Expression Systems) were co-transfected 
with linearized viral genome and the expression plasmid and selected for high 
infectivity. P1 and P2 virus was generated according to the manufacturer's protocol, 
and expression screens and time courses performed to optimize expression yield. 
Best viruses were used to infect 1–2 l SF9 cells at 10^6 cells/ml at 1% vol/vol and
routinely harvested after 40–48 h at about 1.5 × 10^6 cells/ml, suspended in standard
buffer and flash frozen in liquid nitrogen. Pellets were thawed on ice and lysed by 
Dounce homogenizer. Purification took place rapidly in standard buffer at 4°C 
on GS resin in batch format. Bound protein was washed thoroughly and cleaved 
from resin by HRV 3C protease overnight. Proteins retaining 6×His-MBP tags
were purified on amylose resin and eluted with 10 mM maltose. Protein retaining
10×His was eluted from Ni-NTA resin in standard buffer supplemented with
200 mM imidazole. All EEA1 and mutants were immediately further purified by 
Superose 6 size-exclusion chromatography where they eluted as a single peak. All 
experiments were performed with a preparation confirmed for Rab5 and PI(3)P 
binding. Concentrations were determined by UV280 and Bradford assay. All 
proteins were aliquoted and flash frozen in liquid nitrogen and stored at −80 °C.

EEA1 variants extended and swapped were synthesized as genes optimized for
insect cell expression (Genscript). The extended mutant has regions of low coiled- 
coil prediction removed, resulting in an EEA1 construct 1,286 amino acids in 
length (versus 1,411 in wild-type EEA1) (see Extended Data Fig. 3). The swapped 
mutant has the C-terminal portion of the coiled-coil rearranged to follow the 
N-terminal Zn2+-finger domains, and the N-terminal portion of the coiled-coil
therefore rearranged to the C-terminal region of EEA1. Variants were treated iden- 
tically to wild-type EEA1 in purification. 

Static light scattering. An autosampler-equipped Viscotek TDA Max system was
used to analyse the light scattering from purified EEA1. Sample was loaded into the
autosampler and passed through a TSKGel G5000PW column (Tosoh Biosciences)
and fractions were subjected to scattering data acquisition. Data obtained were 
averaged across the protein elution volume and molecular masses determined in 
OmniSEC software package. 

Lipids. The following lipids were purchased and used directly: DOPC, DOPS, 
DOGS-NiNTA, RhoDPPE (Avanti), DiD (Invitrogen) and PI(3)P (Echelon 
Biosciences). Lipids were dissolved in chloroform, except PI(3)P in 1:2:0.8 
CHCl3:MeOH:H2O. All were stored at −80 °C.


Rab5/PI(3)P binding by EEA1. Early endosome fusion assay was performed as 
previously described". To assess the ability of EEA1 to bind competently in a 
GTP-dependent manner to Rab5, Rab5 was bound to GS resin and subsequently 
loaded with nucleotide (GDP, GTP-γS) as previously described®. Binding of
EEA1 and all variants to immobilized Rab5 proceeded for 1 h at room temperature,
and the washed Rab5 resin was evaluated for EEA1 binding by western
blot. Similarly, the binding of EEA1 to PI(3)P containing liposomes was evaluated 
as previously described by formation of liposomes composed of DOPC:DOPS 
or DOPC:DOPS:PI(3)P (85:15 or 80:15:5 respectively)*. Briefly, liposomes were 
formed from the hydration of lipids at 1 mM in standard buffer, and combined 
with EEA1 for 1 h before ultracentrifugation to separate supernatant and pellet
for western blotting to evaluate EEA1 sedimentation. Rabbit anti-EEA1 antibody 
was made in our laboratory. 

Preparation of liposomes. Liposomes were formed by extrusion as pre- 
viously described**. Liposome compositions for fluorescence microscopy 
tethering assays were DOPC:DOPS:DOGS-NiNTA, DOPC:DOPS:PI(3)P, 
DOPC:DOPS:biotin-DPPE, with RhoDPPE and DiD where applicable. Liposome 
compositions for bead-supported membranes were DOPC:DOPS:DOGS-NiNTA, 
DOPC:DOPS:PI(3)P. Solvent was evaporated under nitrogen and vacuum over- 
night. The resulting residue was suspended in standard buffer, rapidly vortexed, 
freeze-thawed five times by submersion in liquid N2 followed by water at 40°C, and 
extruded by 11 passes through two polycarbonate membranes with a pore diameter 
of 100 nm (Avestin). Vesicles stored at 4°C were used within 5 days. 

Bead-supported bilayer preparation. Silica beads (2 μm NIST-traceable
size-standards for optical tweezers, or 10 μm standard microspheres for microscopy;
Corpuscular) were thoroughly cleaned in pure ethanol and Hellmanex (1% sol., 
Hellma Analytics) before storage in water. Supported bilayers were formed as previ- 
ously described with modifications**. Liposomes composed of DOPC:DOPS 85:15 
(with 5% PI(3)P and DOGS-NiNTA where applicable) were added to a solution 
containing 250 mM NaCl for tethering assays (10 μm) and 100 mM for optical
tweezers (2 μm), and 5 × 10^6 beads. Liposomes were added to final concentration of
100 μM and incubated for 30 min (final volume 100 μl). Samples were washed with
20 mM Tris pH 7.4 three times by addition of 1 ml followed by gentle centrifugation
(at 380g). Final wash was with standard buffer. Salt concentrations were optimized 
by examination of homogeneity at the transverse plane followed by examination 
of the excess membrane at the coverslip plane (see Extended Data Fig. 2a—d). 
We found that the membranes were extremely robust in conditions where the 
bilayer is fully formed, and could be readily pipetted and washed, consistent with 
previous reports*°. Membrane-coated beads were used within 1h of production 
and always stored before use on a rotary suspension mixer. 

Confocal microscopy of vesicle-vesicle tethering assay. Glass coverslips were 
cleaned in ethanol, Hellmanex and thoroughly rinsed in water. In these experiments,
the following concentrations were used: 1 nM Rabex-5 (131–394), 100 nM
Rab5-6×His, 120 nM EEA1. Experiments were performed in standard buffer with
5 mM MgCl2 and 1 μM nucleotide. Liposomes and proteins were pre-mixed in
low-binding tubes at concentrations indicated, incubated for 5 min and imaged 
immediately upon addition to the coverslip. Images were acquired with a Nikon 
TiE equipped with a 60x plan-apochromat 1.2 numerical aperture W objective 
and Yokogawa CSU-X1 scan head. Images were acquired on an Andor DU-897
back-illuminated CCD. Acquired images were processed by the SQUASH package 
for Fiji®”. 

Confocal microscopy of bead-supported membrane tethering assay. A 200 μl
observation chamber (μ-Slide 8 well, uncoated, #1.5, ibidi) was pre-blocked with
BSA (1 mg/ml in standard buffer) for 1.5–2 h and washed thoroughly. Finally, 180 μl
of standard buffer containing beads was added to the sample chamber. In these 
experiments, the following concentrations were used: 1 nM Rabex-5 (131–394),
100 nM GFP-Rab5-6×His, and the given EEA1 concentrations (between 30
and 400 nM). Nucleotide control experiments were performed at 190 nM EEA1.
Experiments were performed in standard buffer with 2 mM MgCl2 and 1 mM
nucleotide. Altogether Rab5, Rabex5, nucleotide, EEA1 and buffer were mixed
in low-binding tubes at concentrations indicated, and were added to 240 μl final
volume to assure mixing throughout the chamber volume. 

Images for co-localization analysis were acquired with a Nikon TiE equipped 
with a 60× plan-apochromat 1.2 numerical aperture W objective and Yokogawa
CSU-X1 scan head. Images were acquired on an Andor DU-897 back-illuminated 
CCD. Acquired images were processed by the SQUASH package for Fiji*”. 

Data obtained for distance measurements were acquired in the same way and 
processed in Fiji by determining line profiles eight pixels wide from the centre of the 
bead outwards over an observed vesicle. These profiles were fitted with a Gaussian 
distribution. The alignment of the microscope was confirmed by imaging of 
sub-diffraction beads, revealing no clear systematic shift and a maximum positional 
error of 21 nm determined in Motion Tracking!®. Controls with sub-diffraction- 
sized multicolour particles (Methods) and distance measurements between Rab5 
itself and its resident membrane were within the measurement error of the technique
(approximately 15 nm)**.
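
The distance measurement described above relies on locating the peak of a line profile with a Gaussian fit. The text does not say which fitting algorithm was used, so the sketch below substitutes a simple stand-in: a least-squares parabola fitted to the logarithm of the intensities, which is exact for noise-free Gaussian data but sensitive to noise in the tails. The function name and the synthetic profile are hypothetical.

```python
import math


def fit_gaussian_log_parabola(xs, ys):
    """Fit y = A*exp(-(x-mu)^2/(2*sigma^2)) via a parabola through ln(y).

    Returns (A, mu, sigma). Points with y <= 0 are ignored.
    """
    pts = [(x, math.log(y)) for x, y in zip(xs, ys) if y > 0]
    # Normal equations for ln(y) = a + b*x + c*x^2.
    s = [sum(x ** k for x, _ in pts) for k in range(5)]
    t = [sum((x ** k) * ly for x, ly in pts) for k in range(3)]
    m = [[s[0], s[1], s[2], t[0]],
         [s[1], s[2], s[3], t[1]],
         [s[2], s[3], s[4], t[2]]]
    # Gaussian elimination with partial pivoting on the 3x3 system.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    coef = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        tail = sum(m[r][c] * coef[c] for c in range(r + 1, 3))
        coef[r] = (m[r][3] - tail) / m[r][r]
    a, b, c = coef
    if c >= 0.0:
        raise ValueError("profile has no peak")
    mu = -b / (2.0 * c)
    sigma = math.sqrt(-1.0 / (2.0 * c))
    amplitude = math.exp(a - b * b / (4.0 * c))
    return amplitude, mu, sigma


# Synthetic profile (hypothetical): peak at pixel 9.3, width 2.5 pixels.
pixels = list(range(21))
profile = [100.0 * math.exp(-(x - 9.3) ** 2 / (2 * 2.5 ** 2)) for x in pixels]
amplitude, centre, width = fit_gaussian_log_parabola(pixels, profile)
```

The separation between two fitted centres, multiplied by the pixel size, would then give the distance between the bead membrane and the vesicle.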

Super-resolution imaging of EEA1 termini. HeLa cells were stained using primary 
antibodies against EEA1 N terminus (610457, prepared in mouse, BD Biosciences) 
and EEA1 C terminus (2900, prepared in rabbit, Abcam). The secondary antibodies 
were anti-mouse Alexa568 antibody (A-11004, prepared in goat, Life Technologies) 
and anti-rabbit Alexa647 (A-21244, prepared in goat, Life Technologies). 
Coverslips were mounted in STORM buffer (100 mM Tris-HCl pH 8.7, 10 mM
NaCl, 10% glucose, 15% glycerol, 0.5 mg/ml glucose oxidase, 40 μg/ml
catalase, 1% BME) and sealed with nail polish. Cells were imaged on a Nikon Eclipse
Ti microscope equipped with a 150 mW 561 nm laser and a 300 mW 647 nm laser.
For imaging, laser intensities were set to achieve 50 mW at the rear lens of the
objective. Illumination was applied at a sub-TIRF angle through the objective 
to improve the signal to noise ratio. Videos of 24,000 frames (12,000 frames per 
channel) were acquired by groups of 6 consecutive frames using the NIS Elements 
software (Nikon). Images were aligned using 100 nm Tetraspeck beads (Thermo 
Fisher). This software was also used for peak detection and image reconstruction. 
The localization of the EEA1 termini could be distorted a maximum of approxi- 
mately 20 nm owing to the size of the antibodies. The localization accuracy of the 
secondary antibody was ~25 nm. Measured distances were determined in Fiji and 
represent distances between respective centres-of-mass. Representative experiment 
is shown, n=3. 

Sample preparation for optical trap experiments. Bead-supported membranes 
were prepared as described. The concentrations used were as in the microscopy 
experiments: 1 nM Rabex-5 (131–394), 100 nM Rab5-6×His and EEA1 concentrations
(between 30 and 400 nM). Most experiments were performed at 40 nM
EEA1, with additional trials taking place at 4 and 400 nM. At lowest concentrations, 
single transient events became difficult to observe (<5% had interactions). At the 
highest concentrations, events were often non-transient or repeated. 

Electron microscopy. Samples were rotary-shadowed essentially as described*’. 
Briefly, samples were diluted in a spraying buffer, consisting of 100 mM ammonium 
acetate and 30% glycerol. Diluted samples were sprayed via a capillary onto freshly 
cleaved mica chips. These mica chips were mounted in the high vacuum evapo- 
rator (MED 020, Baltec) and dried. Specimens were platinum coated (5-7.5 nm) 
and carbon was evaporated. Following deposition, the replica was floated off and 
examined at 71,000× magnification and imaged onto a CCD (Morgagni 268D,
FEI; Morada G2, Olympus). 

Analysis of electron microscopy. Images obtained were processed in ImageJ by 
skeletonizing the particles. Lengths were determined directly from these data and 
represent an overestimation due to the granularity of the platinum shadowing 
(5–7.5 nm granules). The bouquet plots were generated by aligning the initial five
segments of the molecules and the entire population set was plotted. 

To determine the curvature measure, we first took the skeletonized curves and 
smoothed them with a window of 8.2 nm. These curves were then segmented 
with 301 equally spaced points, and these smoothed curves were used for the cur- 
vature calculation. We first attempted to define curvature at one segment length 
(~0.75 nm) but this analysis was too noisy to obtain meaningful description of 
the curves. We therefore determined the curvature by taking the difference of 
the tangents and dividing it by the arc length at a distance of ~15 nm (20 points).
The variance of this measure was determined, and bootstrapping with resampling 
was used to determine errors over the whole population and for 1,000 iterations. 
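
The curvature measure described above can be sketched as follows, assuming each molecule is available as a list of (x, y) points with roughly equal spacing. The function names, the fixed random seed and the choice to resample individual curvature values (rather than whole molecules) are illustrative assumptions and may differ from the analysis actually used.

```python
import math
import random
import statistics


def tangent_angles(points):
    """Unwrapped tangent angle (rad) of every segment of a 2D polyline."""
    angles = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        angle = math.atan2(y1 - y0, x1 - x0)
        if angles:
            # Remove 2*pi jumps so that differences are meaningful.
            angle += 2 * math.pi * round((angles[-1] - angle) / (2 * math.pi))
        angles.append(angle)
    return angles


def curvature(points, lag=20):
    """Tangent-angle difference over `lag` segments divided by arc length."""
    angles = tangent_angles(points)
    seg = [math.dist(p, q) for p, q in zip(points, points[1:])]
    values = []
    for i in range(len(angles) - lag):
        arc = sum(seg[i:i + lag])
        values.append((angles[i + lag] - angles[i]) / arc)
    return values


def bootstrap_variance(values, n_iter=1000, seed=0):
    """Mean and spread of the variance over resampled data sets."""
    rng = random.Random(seed)
    n = len(values)
    estimates = []
    for _ in range(n_iter):
        sample = [values[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistics.pvariance(sample))
    return statistics.mean(estimates), statistics.stdev(estimates)
```

As a sanity check, 301 points along a circular arc of radius 50 nm should give a curvature close to 1/50 nm⁻¹ at every position and a variance close to zero.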

Although proteins are not homogeneous polymers, the WLC model cap- 
tures essential aspects of the physics underlying their shape fluctuations*“". 
Calculation of fits to all mean tangent-correlations and the equilibration analysis 
were performed using Easyworm source code in Matlab’. First, the original skel- 
etonized curves were segmented with 301 equally spaced points. These data were 
then used to calculate the tangent-correlations and the kurtosis plots. We fitted 
the regime whereby the kurtosis measurement defined that the molecules were 
equilibrated'*“**, This distance therefore varied (see Extended Data Fig. 6, kurto- 
sis plots), but the estimation of persistence length was only weakly dependent on 
this distance. The fitting routines were then implemented up to the thermal equili- 
bration distance with bootstrapping with resampling, which was run for the whole 
population and 1,000 times to obtain errors. These are given as mean ± standard
deviation. For values and fit statistics, please refer to Supplementary Data Table. 
We did not apply the WLC model to the swapped mutant (Extended Data Fig. 4h) 
because of the lack of significant structural changes upon Rab5 binding (Fig. 2f
and Extended Data Fig. 4f). 
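
A generic sketch of a tangent-correlation fit is given here for orientation; it is not the Easyworm code. It assumes chains equilibrated in two dimensions, for which the worm-like chain model gives a mean cosine of the tangent-angle change of exp(−l/(2·Lp)) at contour separation l. The function names and the through-origin log-linear fit are assumptions of this example, and in practice only separations below the equilibration distance would be passed in.

```python
import math


def tangent_correlation(angle_sets, max_lag):
    """Mean cosine of the tangent-angle change versus lag (in segments).

    `angle_sets` holds one list of tangent angles (rad) per molecule;
    `max_lag` must be shorter than the longest molecule.
    """
    corr = []
    for lag in range(1, max_lag + 1):
        total, count = 0.0, 0
        for angles in angle_sets:
            for i in range(len(angles) - lag):
                total += math.cos(angles[i + lag] - angles[i])
                count += 1
        corr.append(total / count)
    return corr


def persistence_length_2d(corr, spacing):
    """Least-squares fit of ln<cos> = -l / (2 * Lp), forced through the origin."""
    num = den = 0.0
    for k, c in enumerate(corr, start=1):
        if c <= 0.0:
            break  # the logarithm is undefined once correlations reach zero
        separation = k * spacing
        num += separation * math.log(c)
        den += separation * separation
    slope = num / den
    return -1.0 / (2.0 * slope)
```

Feeding the fit a synthetic decay generated with a known persistence length returns that value, which is a convenient check before applying it to measured contours.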

The analytical fitting to the radial distribution functions was performed in 
Python'®. The radial distribution function for a worm-like chain is the probabil- 
ity density for finding the end points of the polymer. The polymers are considered 
as embedded in a two-dimensional space in this scheme. This treatment adopts 
the continuum model of the polymer, thereby defining the statistical properties 
via free energy calculation. Fitting to analytical solution of the WLC yielded a 
mean effective persistence length of 270 ± 14 nm for EEA1 alone (mean ± error
of fit), and two populations of effective persistence lengths (26 ± 2 nm (67%) and
300 ± 14 nm (33%)) for EEA1 in the presence of Rab5:GTP-γS.

Optical tweezer experiments. A custom-built high-resolution dual-trap optical tweezer 
microscope was used*“®, A single stable solid-state laser (Spectra-Physics, 5 W) 
was split by polarization into two traps that could be independently manoeuvred. 
Forces were measured independently in both traps by back-focal plane interferometry.
Absolute distances between the two traps were determined by template-based
video microscopy analysis (43 ± 2 nm per pixel) and offset-corrected for
each microsphere pair by repeatedly contacting the microspheres after each exper- 
iment. The template detection algorithm had subpixel accuracy, at an estimated 
uncertainty in absolute distance measurements to be not more than ± 20 nm. Bead
displacement was calculated according to ΔF = −κ Δx. Extended Data Fig. 7g
demonstrates the sensitivity of the instrument via the Allan deviation“” for aver- 
aging times greater than 100 ms. 

All optical tweezer experiments were performed with 2 μm silica size-standard
microspheres (Corpuscular), at a temperature of 26 ± 2 °C in a laminar flow chamber
with buffers containing 35% glycerol to prevent sedimentation of the silica microspheres.
Thermal calibration of the optical traps was performed with the power
spectrum method using a dynamic viscosity of 3.1 mPa s (ref. 48) (mean trap
stiffness: trap 1, κ1 = 0.035 ± 0.007 pN/nm; trap 2, κ2 = 0.029 ± 0.007 pN/nm),
leading to an overall trap stiffness of κ = 0.0159 pN/nm (yellow response curve in
Extended Data Fig. 7h). Data were acquired at 1 kHz and further processed using 
custom-written software in R. Spurious electronic noise at 50 Hz was filtered using 
a fifth-order Butterworth notch filter from 49 to 51 Hz. 
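
The overall trap stiffness quoted above is numerically what two springs acting in series would give for the two individual trap stiffnesses. The short check below makes that arithmetic explicit; treating the two traps as springs in series is an interpretation made here, and the function names are hypothetical.

```python
def series_stiffness(k1, k2):
    """Effective stiffness (pN/nm) of two harmonic traps acting in series."""
    return k1 * k2 / (k1 + k2)


def bead_displacement(delta_force, stiffness):
    """Displacement magnitude (nm) for a force change (pN): |ΔF| / κ."""
    return abs(delta_force) / stiffness


k_overall = series_stiffness(0.035, 0.029)
print(round(k_overall, 4))  # prints 0.0159
```

With the same relation, a 0.5 pN force change would displace a bead in the first trap by about 14 nm.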

For probing the interactions of EEA1 with Rab5 without any assumptions on the 
shape of EEA1, a distance agnostic protocol with consecutive cycles of approach- 
ing, waiting (20s) and retraction was used, approaching closer in each iteration 
(Fig. 3b). The stationary segments were then subjected to automatic change-point 
analysis to identify regions of the time series longer than 100 ms with significantly 
different mean and variance’’. Events thus identified were classified as transient 
if the mean and variance went back to base levels within the stationary segment 
(see examples in force traces in Fig. 3c and Extended Data Fig. 7). Mean times 
of interactions were 3.4 ± 0.6 s for GTP-γS and 0.9 ± 0.2 s for GTP. A fluctuation
analysis of the differential distance signal during these events gave an estimated 
tether misalignment of less than 30° in all interactions®°. Only transient events 
were further processed. Silica beads alone as a negative control measured a mean 
contact distance of 22 nm (Fig. 3d, grey). 

To calculate the persistence length for individual captured molecules we determined
the equilibrium extension, z_eq, from the capture distance D (nm), the average
measured force increase upon tethering ΔF (pN) and the known displacements
from each trap Δx1 = ΔF/κ1 and Δx2 = ΔF/κ2 as z_eq = D − Δx1 − Δx2. With this
distance, the persistence length was calculated according to*!

$$ L_p = \frac{k_B T}{\Delta F}\left[\frac{z_{eq}}{L} - \frac{1}{4} + \frac{1}{4\,(1 - z_{eq}/L)^{2}}\right] $$

where L_p is the persistence length and L is the contour length of the molecule.


Similarly, to estimate the magnitude of the entropic collapse force, this formula was 
applied to the equilibrium extensions of EEA1, as estimated by the end-to-end dis- 
tances of the molecules from electron microscopy. Values determined were (median 
and bounds at (2.5%, 97.5%)) EEA1, 23 (14, 33) nm; extended, 73 (60, 88) nm;
swapped, 26 (21, 30) nm; 10×His, 78 (35, 140) nm. Values reported are medians
and 95% confidence intervals determined from bootstrapping. 
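
The persistence-length calculation described in this section can be sketched as below, assuming the standard worm-like chain interpolation formula solved for the persistence length. The thermal energy (4.13 pN nm, roughly 26 °C), the 222 nm contour length used in the usage line and the function names are assumptions of this example; the capture distance, force and trap stiffnesses are example values taken from the text.

```python
KBT_PN_NM = 4.13  # assumed thermal energy near 26 °C, in pN·nm


def equilibrium_extension(capture_distance, delta_force, k1, k2):
    """z_eq = D - Δx1 - Δx2, with Δx_i = ΔF / κ_i for each trap (nm)."""
    return capture_distance - delta_force / k1 - delta_force / k2


def persistence_length(z_eq, delta_force, contour_length, kbt=KBT_PN_NM):
    """Invert the worm-like chain interpolation formula for Lp (nm)."""
    x = z_eq / contour_length
    if not 0.0 < x < 1.0:
        raise ValueError("extension must lie between 0 and the contour length")
    return (kbt / delta_force) * (x - 0.25 + 0.25 / (1.0 - x) ** 2)


# D = 195 nm, ΔF = 0.5 pN, κ1 and κ2 as quoted; contour length is hypothetical.
z_eq = equilibrium_extension(195.0, 0.5, 0.035, 0.029)
lp = persistence_length(z_eq, 0.5, 222.0)
```

Because the last term diverges as z_eq approaches the contour length, the estimate is very sensitive to the assumed contour length when the tether is nearly fully extended.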

Generation of HeLa EEA1-KO cell line. HeLa EEA1-KO lines were generated 
using CRISPR-Cas9 technology” on HeLa-Kyoto cell lines obtained from the BAC 
recombineering facility at the Max Planck Institute of Molecular Cell Biology and 
Genetics. Cell lines were tested for mycoplasma and authenticated (Multiplexion, 
Heidelberg). pSpCas9(BB)-2A-GFP (PX458) and pSpCas9(BB)-2A-Puro (PX459)
were a gift from F. Zhang (Addgene plasmid 48138, 48139). A PX458 plas- 
mid encoding a GFP-labelled Cas9 nuclease and the sgRNA sequence (from 
GECKO* library 17446, GTGGTTAAACCATGTTAAGG, targeting first exon) 
was transfected into standard HeLa Kyoto cells with Lipofectamine 2000 fol- 
lowing the manufacturer’s instructions. Cells were cultured in DMEM media 
supplemented with 10% FBS and 1% penicillin-streptomycin at 37°C and 5% 
CO2. After 3 days, the transfected cells were FACS sorted by their GFP fluorescence
into 96-well plates to obtain single clones and visually inspected°*. These
clones were then screened by western blotting and in-del formation was confirmed by
sequencing of genomic DNA (primer forward, AGCGGCCGTCGCCACCG;
reverse, TAAGCGCCTGCCGGGCTG). Note the region is extremely GC-rich
(75%, ± 250 nt from targeted indel region). Additionally, a mixed-clonal line
was obtained by transfection of HeLa Kyoto with PX459 with the above sgRNA
sequence. After 72 h from transfection, cells were exchanged into media supplemented
with 0.5 μg/ml puromycin (concentration determined in a separate
experiment) and selected for 3 days. All imaging experiments were confirmed
on this secondary line. 
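
The GC richness noted above can be checked for any sequence with a one-line count. The sketch below applies it to the sgRNA and primer sequences quoted in this section; it does not reproduce the 75% figure, which refers to a ±250 nt genomic window whose sequence is not given here. The function name is hypothetical.

```python
def gc_fraction(sequence):
    """Fraction of G and C bases in a DNA sequence (case-insensitive)."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in sequence) / len(sequence)


sgrna = "GTGGTTAAACCATGTTAAGG"
forward_primer = "AGCGGCCGTCGCCACCG"
reverse_primer = "TAAGCGCCTGCCGGGCTG"
fractions = {name: gc_fraction(seq) for name, seq in
             [("sgRNA", sgrna), ("forward", forward_primer),
              ("reverse", reverse_primer)]}
```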

Endocytosis rescue assays. Wild-type EEA1 and the extended and swapped 
variants (Extended Data Fig. 3) were cloned into customized mammalian expres- 
sion plasmids under the CMV promoter resulting in untagged proteins. HeLa or 
HeLa EEA1-KO cells were seeded into 96-well plates and transfected (or mock 
transfected) after 48 h. At 48 h after transfection, cells were exchanged into
serum-free media containing 8.2 μg/ml LDL-Alexa 488 (prepared as previously
described!®) or 100 ng/ml EGF-Alexa 488 (E13345, Thermo Fisher) for 10 min at 
37°C, and washed in PBS then fixed in 4% paraformaldehyde. 

Automated confocal immunofluorescence microscopy and analysis. Fixed cells 
were stained with antibodies against EEA1 (laboratory-made rabbit) and Rab5 
(610724, prepared in mouse, BD Biosciences) as previously described**. DAPI 
was used to stain the nuclei. Not all early endosomes harbour EEA1 (ref. 54) and
other tethering factors could compensate for EEA1 (refs 24, 55). All imaging was 
performed on a Yokogawa CV7000S automated spinning disc confocal using a
60× 1.2 numerical aperture objective. Fifteen images were acquired per well and
each condition was duplicated at least twice per plate, resulting in 30 or more 
images per condition. 

Image analysis used home-made software, MotionTracking, as previously
described**°”. Images were first corrected for illumination, chromatic aberration
and physical shift using multicolour beads. All cells, nuclei and cell objects in 
corrected images were then segmented and their size, content and complexity 
calculated. The intensity of EEA1 in wild-type HeLa cells was measured to deter- 
mine a wild-type intensity distribution. In the rescue experiments, an intensity 
threshold for the transfections was set at about two times the mean of wild-type 
cells (Extended Data Fig. 8i). Experiments were repeated at different seeding den- 
sities with similar results. Given a cell density threshold between 10 and 100 per 
image, we obtained an average of more than 300 cells per condition after filtering for 
the transfection level of EEA1, and more than 15,000 endosomes per experiment. 
A two-tailed t-test was used for significance calculations. 

Cell electron microscopy. Cells in 3 cm diameter plastic dishes were processed
for electron microscopy using a method** to provide particularly heavy staining 
of cellular components. Briefly, cells were fixed by addition of 2.5% glutaraldehyde 
in PBS for 1h at room temperature and then washed with PBS. The cells were 
then processed as described°® with sequential incubations in solutions containing 
potassium ferricyanide/osmium tetroxide, thiocarbohydrazide, osmium tetrox- 
ide, uranyl acetate and lead nitrate in aspartic acid before dehydration and flat 
embedding in resin. Sections were cut parallel to the substratum and analysed 
unstained in a JEOL 1011 transmission electron microscope (Tokyo, Japan). 
Images for quantitation were collected from coded samples (double blind) to 
avoid bias. 

Distance analysis used ImageJ. To correct for thickness of slices (60 nm), the 
following equation was used: 


= 7), PE ae 


where P(r) is the apparent 2D distance distribution, R is the 3D distance, H is the 
thickness of the slice and Z is the normalization constant. Uncorrected distance 
was measured at 119.8 ± 78.2 nm (mean ± s.d.), which resulted in 130.0 ± 76.8 nm
corrected. 
Extended Data Figure 1 | EEA1 is a predicted extended coiled-coil
dimer that binds Rab5 in a GTP-dependent manner and extends
outwards from endosomes. a, Human EEA1 in COILS prediction reveals
a clear coiled-structure flanked by the Rab5-binding Zn2+-finger on the
N terminus and PI(3)P-binding FYVE domain on the C terminus.
b, Coomassie-stained gel of human EEA1 expressed as a GST fusion
in SF+ insect cells and purified by GS affinity, cleaved on resin, and
subsequently concentrated and separated from smaller contaminants by
size-exclusion chromatography on a Superose 6 column. c, Static light
scattering in line with size-exclusion chromatography reveals a molecular 
mass of 323 kDa, compared with a theoretical molecular mass of 326 kDa 
for a dimeric protein. d, Purified protein binds Rab5 in both standard 
and optical tweezer conditions (35% glycerol) in a GTP-dependent
manner. GST or GST-Rab5 was purified and conjugated to GS resin, and
subsequently nucleotide was exchanged to either GTP-γS or GDP using
EDTA-Mg2+-mediated exchange and subsequent wash. The GST resin was
then incubated with EEA1 in either the standard or optical tweezers buffer, 
washed three times, and beads were then blotted for EEA1. e, Recombinant 
EEA1 binds specifically to PI(3)P liposomes. When mixed with
POPC:POPS 85:15 liposomes, no EEA1 is observed in the liposome pellet 
(CTRL). In contrast, EEA1 is pelleted with control POPC:POPS:PI(3)P 
80:15:5 liposomes (PI3P). f, The N-terminal Zn2+-finger and C-terminal
FYVE domain of EEA1 were differentially labelled with specific antibodies 
and STORM microscopy performed to define their localization in HeLa 
cells. Representative STORM images of EEA1 radial extension from 
endosome of n = 22. Scale bar, 500 nm. g, h, Primary antibody binding 
controls for N and C termini. Primary antibodies for the N (g) and C (h) 
termini were left out of the staining, resulting in no unspecific secondary 
staining for each. Representative of n=5. Scale bar, 500 nm. 
Extended Data Figure 2 | Validation of bead-supported lipid bilayers
for optical tweezers, and bead tethering experiment controls and
methods. To optimize the conditions for forming supported lipid bilayers
on the 2–10 μm beads, we systematically investigated the dependence of
membrane formation on salt and liposome concentration. a, Fluorescent
profiles of supported lipid bilayer bead cross sections. At high liposome
concentration (100 μM, solid line) during formation of the bilayer on
the silica bead, the bead-supported membrane fluorescence intensity is
circumferentially homogenous. At lower lipid concentrations (10 and 1 μM,
dashed and dotted lines), less than full coverage is achieved and the
supported bilayer is inhomogeneous. b, Consistent with previous reports, 
increasing salt concentrations result in more homogenous membrane 
coverage. c, Representative examples of the ‘spilled-out’ membrane of 
beads prepared at 100 mM (top, blue) and 250 mM (bottom, red) NaCl 
salt and 100m liposomes, of n=5. d, Histogram of the size of membrane 
spilled from the beads onto the substrate when prepared at 100 and 
250mM NaCl (blue and red, respectively). This indicated that the lower 
salt samples (blue) were homogenously covered with membrane and 

that they had little excess present, and therefore the optimal conditions 
for formation of membrane on the silica beads used in tethering and in 
optical tweezer experiments. e, Segmentation of beads and vesicles by 
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the SQUASH method. Bead-supported bilayers and vesicles (green and 
magenta, respectively) were segmented as illustrated by red outlines to 
determine their co-localization. Representative of n= 1 generated for 
schematic. f, Methodology comparison for co-localization in GDP and 
GTP-\$S conditions. All methods give P < 0.01 in a two-tailed Student’s 
t-test. Co-localization by signal is better than by size or object, as vesicles 
become undercounted at high concentrations. Mean +s.d.,n=5. 

g, Co-localization of liposomes (PI(3)P, magenta) to the bead-supported 
membrane (GFP-Rab5, green) was strictly dependent on GTP-\S. 
Box-whisker plot with minimum/maximum error, n=5.h, The 
co-localization of liposomes to the supported membrane was dependent 
on EEA1 concentration. At higher concentrations of EEA1, co-localization 
approached 100%. These concentrations are within the range of the 
concentration of endogenous protein”*. Mean +s.d., n=5. i, Time-lapse 
micrographs of the bead-supported bilayer labelled with GFP-Rab5 
(green), and a dynamically tethered vesicle (magenta). Vesicles were 
observed to tether and reversibly leave the membrane, as well as diffuse 
about its surface. Images displayed were acquired at 350 ms intervals as 
z-stacks. Representative of n = 1 to acquire video. Scale bar, 2|1m. 

j, Example fits for radial line-profile data. 
[Figure: sequence alignment of sp|Q15075|EEA1_HUMAN/1-1411, EEA1-Extended/1-1286 and EEA1-Swapped/1-1411; axis label: Amino Acid.]
Extended Data Figure 3 | Structure prediction and sequence description
of EEA1 mutants. a, COILS prediction for extended EEA1 mutant,
revealing removal of most of the discontinuities in the coiled-coil.
b, c, The swapped EEA1 mutant has a rearranged coiled-coil. The coiled-coil
was split as indicated by red triangles in the original EEA1-WT (b), and
the two regions a (shaded green) and b (shaded magenta) were rearranged
in a synthetic gene, producing the swapped EEA1 variant maintaining
the features and sequence of the original coiled-coil, but in an alternative
location (c). d, Full sequence alignment for human EEA1 and the extended
and swapped mutants used in the study. The crystal structure (Protein
Data Bank accession number 3MJH) for the Zn²⁺-finger domain is
marked in dark blue close to the N terminus. Segment a of the coiled-coil
region is marked in green, and segment b in magenta. The crystal structure
(Protein Data Bank accession number 1JOC) of the C-terminal FYVE
domain and portion of the coiled-coil is marked in cyan. Details of the
mutant constructs are found in the Methods.
Extended Data Figure 4 | Extended and swapped EEA1 mutants exhibit
limited changes in the presence of Rab5:GTP-γS. a, e, Rotary-shadowed
EEA1-extended particles and EEA1-swapped mutants were skeletonized
and analysed in ImageJ for contour length (top), resulting in normally
distributed contour length histograms. The end-to-end length
histograms (bottom) are similarly distributed. These data were collected
on N-terminally MBP-tagged samples. Compare with wild-type in Fig. 2b, d;
n = 212 for the extended and n = 93 for the swapped variants.
b–d, f, g, The EEA1 mutants revealed limited changes to their curvature
in the presence of Rab5:GTP-γS (b, f; compare Fig. 2i, j), and therefore
minor changes to their contour and end-to-end length histograms (c, g)
and radial distribution plots (d, h); n = 80 for the extended and n = 47 for
the swapped variants. i, j, Rotary-shadowing electron microscopy of EEA1
in the presence of Rab5:GDP (n = 90), N-terminally MBP-tagged, revealed
no change in appearance compared with the absence of Rab5 entirely
(Fig. 2a), and no effect of N-terminal tagging relative to wild-type EEA1.
k, Radial distribution function of EEA1 in the presence of Rab5:GDP
(compare d, h; Fig. 2g); n = 90.
Extended Data Figure 5 | Representative segmentation, smoothing
and signed curvature measures for EEA1, and averages for EEA1 and
mutants. EEA1 and EEA1 mutants were skeletonized and smoothed using
a moving average filter with a window of 8.2 nm, segmented to 300 equally
spaced segments and aligned N terminus to C terminus by recognition of
an N-terminal MBP-tag. Their curvature was calculated at 15 nm distances
along the length of the proteins and plotted. a–c, Representative examples
of rotary shadowing derived EEA1 curves. The original data appear in
the first panel, with the second panel revealing the data after smoothing
for comparison (Methods). The curvature measure, determined by
how the tangents to the contour change at a distance of 15 nm along the
contour, is plotted below. Note that the choice of sign for the curvature
measure is arbitrary for each molecule. d, e, Curvature measure and
variance of this measure for EEA1 in the presence of Rab5:GDP (green)
and EEA1 in the presence of Rab5:GTP-γS (magenta); n = 90, n = 145,
respectively. Alignment of EEA1 curvature from the electron microscopy
data reveals an increase in curvature over the length of the molecule
upon Rab5 binding, whereas the extended and swapped EEA1 variants
show no change. All curvature values were taken to be positive given
that the N-terminal MBP could be recognized but the handedness of the
molecule adsorbed to the grid could not be inferred. Bootstrapping with
resampling at full population size was performed for 1,000 iterations to
determine errors. f, g, Extended EEA1 variant in the absence (green) and
in the presence of Rab5:GTP-γS (magenta); n = 212, n = 80, respectively.
h, i, Swapped EEA1 variant in the absence (green) and in the presence of
Rab5:GTP-γS (magenta); n = 93, n = 47, respectively.
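
The smoothing, resampling and curvature steps named in this caption can be illustrated with a short sketch. The Python below is not the authors' pipeline: the function names, the representation of a skeleton as an (N, 2) array of coordinates in nanometres, and a smoothing window given in points rather than nanometres are all assumptions made for the example.

```python
import numpy as np

def moving_average(points, window):
    """Smooth an (N, 2) contour with a moving average over `window` points (assumed unit)."""
    kernel = np.ones(window) / window
    # mode="valid" trims window - 1 points instead of padding the ends
    return np.column_stack(
        [np.convolve(points[:, i], kernel, mode="valid") for i in range(2)]
    )

def resample(points, n_segments):
    """Resample a contour to n_segments segments equally spaced in arc length."""
    steps = np.hypot(*np.diff(points, axis=0).T)
    arc = np.concatenate([[0.0], np.cumsum(steps)])
    target = np.linspace(0.0, arc[-1], n_segments + 1)
    resampled = np.column_stack(
        [np.interp(target, arc, points[:, i]) for i in range(2)]
    )
    return resampled, arc[-1]

def curvature_measure(points, contour_length, lag_nm):
    """Signed change in tangent angle (radians) over a contour distance of lag_nm."""
    tangents = np.diff(points, axis=0)
    theta = np.unwrap(np.arctan2(tangents[:, 1], tangents[:, 0]))
    seg_len = contour_length / len(tangents)
    lag = max(1, int(round(lag_nm / seg_len)))  # lag expressed in segments
    return theta[lag:] - theta[:-lag]
```

For a molecule whose handedness on the grid is unknown, as the caption notes, one would take `np.abs` of the returned values before averaging. As a sanity check, an arc of radius 100 nm should give a tangent-angle change of about 0.15 rad over 15 nm.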
Extended Data Figure 6 | Detailed persistence length and equilibration
analysis for EEA1 and variants. To validate the methodology used for
analysis of the persistence lengths, and to assure internal consistency
in analysis methods, we systematically applied the analysis to EEA1
(and mutants, see Supplementary Data Table). The skeletonized curves
were segmented to 300 equally spaced segments, where θ describes the
angle between segments. The tangent-tangent correlations were then
determined for the entire ensembles. a–h, To determine the molecular
equilibration of EEA1 and variants from 3D to 2D, the kurtosis of the
theta distribution (top) was calculated. Full equilibration to 2D gives a
value of 3.0, and for 3D the expected value is 1.8 as the angle distributions
become Gaussian. As expected, the measured kurtosis is approximately
3.0 until lengths above the persistence length of the molecule, where
the equilibration begins to fail. The value at which the kurtosis began to
diverge from 2D was taken as the limit for subsequent measurements, as
beyond this limit (red shaded region) 3D fluctuations are not retained and
as such the consequences of surface adsorption are uncertain. Next, the
tangent-tangent correlation was calculated across the ensemble and fitted
up to the divergence of the kurtosis (red shaded region).
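
The two quantities used in this caption, the kurtosis of the angle distribution and the tangent-tangent correlation, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' analysis code: the function names are hypothetical, the input is assumed to be an array of tangent angles with one row per molecule, and the fit assumes the standard two-dimensional worm-like-chain form, in which the correlation decays as exp(-s / 2Lp).

```python
import numpy as np

def kurtosis(x):
    """Non-excess kurtosis: fourth central moment divided by the squared variance."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return np.mean(d**4) / np.mean(d**2) ** 2

def tangent_correlation(theta, max_lag):
    """Ensemble average of cos(theta(s + l) - theta(s)) for lags 1..max_lag.

    theta: (n_molecules, n_segments) array of tangent angles in radians.
    """
    return np.array(
        [np.mean(np.cos(theta[:, l:] - theta[:, :-l])) for l in range(1, max_lag + 1)]
    )

def persistence_length_2d(corr, seg_len):
    """Fit corr = exp(-s / (2 * Lp)) by least squares on log(corr) through the origin."""
    s = seg_len * np.arange(1, len(corr) + 1)
    slope = np.sum(s * np.log(corr)) / np.sum(s * s)
    return -1.0 / (2.0 * slope)
```

A Gaussian sample returns a kurtosis close to 3.0, and a uniform sample returns 1.8, which gives a quick numerical reference for the two values quoted above. In practice `max_lag` would be capped at the contour length where the measured kurtosis starts to diverge, as the caption describes.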
Extended Data Figure 7 | Supplementary data related to optical
tweezer experiments. a, Change-point analysis was used to identify
changes in the mean and variance of the combined force signal. An
example plot of averaged force (linear combination of signals from both
traps) with respect to time. Data were collected at 1 kHz. Two long
transient interactions can be clearly identified. b, c, Cross-correlation
of the force signals from each trap is not sufficient to reveal stepwise
interactions, as it is time-averaged. By applying cross-correlation over
a correlation window of 0.8 s (b) or 0.3 s (c), long transient interactions
(that is, at ~4 s) could be identified. However, an unbiased identification
of short transients (that is, at ~9 s) by this method was not possible. All
identified long transient interactions showed characteristic changes in
the cross-correlation: anti-correlation as beads are pulled together, and
correlation after tethering was established. d, Change-point analysis
was used to detect both changes in mean and variance of the combined
force signal, and thereby identify transient interactions (red line). This
procedure has the additional advantage of defining clear boundaries
to stepwise processes. e, The possibility of multiple tethers taking part
in the reaction was observed. Averaged force trace for wild-type EEA1
occasionally showed signals consistent with multiple interactions (cyan),
in addition to single transient interactions (red). f, Zoom into time series
around the transient interaction identified in the previous panel. To a
first approximation, the dynamic interactions were fitted as piecewise
constant steps (red). Note also that two very short (<10 ms) spikes of
similar magnitude (to the left and right of the identified interaction)
occurred but are not used in further analysis. Only transients with a
duration longer than 100 ms were analysed. g, To illustrate the
sensitivity of the optical tweezer experiments, a noise analysis was
performed on the segment outlined in the top panel (yellow, labelled
Allan analysis). The Allan deviation (square root of Allan variance, in
piconewtons) gives a threshold for detecting a signal change over
different averaging windows. All detected transients (blue) are at
minimum an order of magnitude above this threshold. To provide
perspective, the transient in the above example is indicated as a red
dot. h, The entropic collapse force is balanced in the tweezer
experiments below its peak value. The balance between the average
restoring force in the optical traps (brown) and the entropic collapse force
of EEA1 (blue) in the bound state gives the measured equilibrium force
and extension (red dot). The schematic assumes the measured capture
distance of 195 nm, a persistence length in the Rab5:GTP-bound state of
λ_p = 26 nm, and a contour length of 222 nm. The overall trap response
of the dual-trap system is treated as two springs in series with the mean
trap stiffness in trap 1 (κ₁ = 0.035 ± 0.007 pN/nm) and the mean trap
stiffness in trap 2 (κ₂ = 0.029 ± 0.007 pN/nm), leading to an overall trap
stiffness of κ = 0.0159 pN/nm (brown line). Given these parameters,
the predicted equilibrium force in the optical trap for Rab5-bound EEA1
is ~0.6 pN and the predicted equilibrium extension ~160 nm. i, Force
changes upon capture for Rab5:GTP-bound EEA1 and the extended
and swapped variants. Force was measured from change-point analysis
for transient interactions between EEA1 beads and Rab5:GTP beads.
To test binding per se, the force change for 10×His-EEA1 beads tethered
to Ni-NTA beads was similarly determined from established connections.
For 10×His-EEA1, no transient interactions could be observed. Median
change in force and 95% confidence interval from bootstrapping with
resampling (lower and upper bounds at (2.5%, 97.5%)) were determined.
EEA1, 0.37 (0.31, 0.46) pN; extended, 0.39 (0.35, 0.42) pN; swapped, 0.45
(0.41, 0.56) pN; 10×His, 0.19 (0.14, 0.22) pN. j, Capture distances defined
at the proximal distance upon which transient interactions were observed
for Rab5-bound EEA1 and the extended and swapped variants. Median
capture distance and 95% confidence interval from bootstrapping with
resampling (lower and upper bounds at (2.5%, 97.5%)) were determined.
EEA1, 168 (141, 182) nm; extended, 195 (189, 199) nm; swapped, 183
(179, 189) nm; 10×His, 157 (120, 196) nm; n = 60, 93, 27, 24 per condition
respectively. k, Mechanical work is performed as the tether collapses. The
mechanical work performed during the relaxation to the new equilibrium
extension is the integral under the force-extension curve. The exact value
of the extracted work depends both on the capture distance (the extension
at the moment of persistence length change) and on the release distance
(the extension at the moment when Rab5 unbinds). The uncertainties in
these extensions are different for the two positions, reflecting the different
longitudinal fluctuations of the rigid or the flexible tether (λ_flexible = 26 nm
(blue arrows), λ_rigid = 300 nm (magenta arrows)). For example, for a
relaxation between the capture distance, d_capture ≈ 195 nm, and the release
extension, d_release ≈ 122 nm, the extracted mechanical work is W ≈ 14 k_B T.
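
The arithmetic behind panels h and k can be checked with a short sketch. The caption does not state which worm-like-chain expression was used, so the Marko-Siggia interpolation formula below is an assumption, as are the value k_B T = 4.11 pN nm (room temperature) and all function names. The trap is modelled, again as an assumption, as a linear spring whose restoring force grows as the tether shortens from the capture distance.

```python
KT = 4.11  # assumed thermal energy at room temperature, in pN nm

def series_stiffness(k1, k2):
    """Effective stiffness of two springs in series."""
    return k1 * k2 / (k1 + k2)

def wlc_force(x, lp, lc):
    """Marko-Siggia worm-like-chain force (pN) at extension x (assumed model)."""
    r = x / lc
    return (KT / lp) * (0.25 / (1.0 - r) ** 2 - 0.25 + r)

def wlc_work_kt(x_from, x_to, lp, lc):
    """Integral of wlc_force from x_from to x_to, expressed in units of kBT."""
    def antiderivative(x):
        return lc / (4.0 * (1.0 - x / lc)) - x / 4.0 + x * x / (2.0 * lc)
    return (antiderivative(x_to) - antiderivative(x_from)) / lp

def equilibrium(capture, kappa, lp, lc, tol=1e-6):
    """Bisection for the extension where wlc_force equals kappa * (capture - x)."""
    lo, hi = 0.0, capture
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if wlc_force(mid, lp, lc) < kappa * (capture - mid):
            lo = mid  # tether force too small: equilibrium lies at larger extension
        else:
            hi = mid
    x = 0.5 * (lo + hi)
    return x, wlc_force(x, lp, lc)

# Values quoted in the caption
kappa = series_stiffness(0.035, 0.029)
x_eq, f_eq = equilibrium(capture=195.0, kappa=kappa, lp=26.0, lc=222.0)
work = wlc_work_kt(122.0, 195.0, lp=26.0, lc=222.0)
```

Under these assumptions the sketch reproduces the caption's figures to within rounding: an overall stiffness of roughly 0.0159 pN/nm, an equilibrium near 160 nm and 0.6 pN, and about 14 k_B T of work between 195 nm and 122 nm.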
Extended Data Figure 8 | EEA1 mutants incapable of undergoing
entropic collapse result in defects in endosomal trafficking.
a, b, Automated confocal immunofluorescence images (n = 30 each)
of HeLa EEA1-KO and standard HeLa cells. EEA1 (green) and Rab5
(magenta). Scale bar, 10 µm. c, Western blot of HeLa and HeLa EEA1-KO
clonal cell line for EEA1 and Rab5. d, e, g, h, Automated confocal images
(n = 30 each) of HeLa EEA1-KO cells expressing no EEA1 (KO, d),
rescued with wild-type EEA1 (rescue, e) or extended and swapped
mutants (g, h). Cells were pulsed with fluorescently labelled cargo (LDL)
(green) for 10 min, fixed and immunostained for Rab5 (magenta) and
EEA1 (for EEA1, see Fig. 4). Magnified insets of endosomes are depicted
at arrows. Scale bar, 10 µm. f, Relative complexity of Rab5 endosomes
per cell. Each Rab5 endosome is segmented, and the segmented object
requires a defined number of 2D Gaussian functions, hereby referred to
as complexity. Relative to wild type, HeLa EEA1-KOs (black line) had a
significantly reduced number of endosomes of high complexity (>3.0),
but more endosomes defined simply by one or two Gaussian functions.
Rescue experiments (red) revealed no significant difference in complexity.
In contrast, both extended and swapped mutants (blue and green
respectively) had significantly fewer simple endosomes of low complexity,
and significantly more of higher complexity. Mean ± s.d., n = 30.
i, Histogram of fluorescence intensity of EEA1 per cell. KO cell lines had a
sharp peak of intensity at background levels, whereas wild-type HeLa cells
had a normal distribution. Grey box represents threshold levels of EEA1
intensity per cell taken for analysis. j–l, EGF uptake experiments. Confocal
images of HeLa EEA1-KOs expressing wild-type EEA1 (rescue, j) or
extended and swapped mutants (g, h). Cells were pulsed with fluorescently
labelled EGF (green) for 10 min, fixed and immunostained for EEA1
(magenta). Images shown are maximum intensity projections. Scale bar,
5 µm. m, HeLa EEA1-KO cells in which the swapped EEA1 mutant was
reintroduced showed clusters of vesicles and more rarely the classical
endosomal morphology. The clusters were clearly delineated by a zone of
cytoplasm with a distinct density. Representative of n = 19. Scale bars, 2 µm.
n, Further quantifications, and the swapped mutant ultrastructural
phenotype. Fraction of endosomal surface containing filamentous material
for HeLa and HeLa EEA1-KOs. Box-whisker plot with minimum/maximum
values, n = 22, 24 endosomes. **P < 0.01, two-tailed Student’s
t-test. o, Distance measured between endosome and tethered vesicles
(HeLa) or between vesicles within large clusters (extended)
(surface-to-surface, n = 158 and 623 for HeLa and extended respectively;
***P < 10⁻⁴, two-tailed Student’s t-test).
Extended Data Figure 9 | Unlabelled version of Fig. 5.
Extended Data Figure 10 | Bouquet plots of EEA1 and variants. EEA1 in
the absence of Rab5 is predominantly extended. The initial five segments
of the curves from rotary shadowing electron microscopy were aligned
and the curves plotted with the end position highlighted (dots). Grey
concentric hemispheres demarcate 50, 100, 150 and 200 nm extensions
from the origin. The end positions therefore resulted in a cloud of
empirical positions for the N terminus of EEA1 (left), and reveal
the overall change in conformational space that can be occupied by EEA1
when bound to Rab5:GTP-γS (right). b, Bouquet plots for the extended
EEA1 variant. c, Bouquet plots for the swapped EEA1 variant.
Small molecule stabilization of the KSR inactive
state antagonizes oncogenic Ras signalling

Neil S. Dhawan, Alex P. Scopton & Arvin C. Dar


Deregulation of the Ras-mitogen activated protein kinase (MAPK)
pathway is an early event in many different cancers and a key driver
of resistance to targeted therapies. Sustained signalling through
this pathway is caused most often by mutations in K-Ras, which
biochemically favours the stabilization of active RAF signalling
complexes. Kinase suppressor of Ras (KSR) is a MAPK scaffold
that is subject to allosteric regulation through dimerization with
RAF. Direct targeting of KSR could have important therapeutic
implications for cancer; however, testing this hypothesis has been 
difficult owing to a lack of small-molecule antagonists of KSR 
function. Guided by KSR mutations that selectively suppress 
oncogenic, but not wild-type, Ras signalling, we developed a class 
of compounds that stabilize a previously unrecognized inactive 
state of KSR. These compounds, exemplified by APS-2-79, 
modulate KSR-dependent MAPK signalling by antagonizing RAF 
heterodimerization as well as the conformational changes required 
for phosphorylation and activation of KSR-bound MEK (mitogen- 
activated protein kinase kinase). Furthermore, APS-2-79 increased 
the potency of several MEK inhibitors specifically within Ras- 
mutant cell lines by antagonizing release of negative feedback 
signalling, demonstrating the potential of targeting KSR to 
improve the efficacy of current MAPK inhibitors. These results 
reveal conformational switching in KSR as a druggable regulator 
of oncogenic Ras, and further suggest co-targeting of enzymatic and 
scaffolding activities within Ras-MAPK signalling complexes as a 
therapeutic strategy for overcoming Ras-driven cancers. 

Ras is the most frequently mutated human oncogene. Yet, despite
recent breakthroughs, therapeutic options to target Ras-dependent
cancers remain limited. Studies conducted in several different model
systems support the possibility of Ras-targeted interventions via
KSR. However, due to its status as a pseudokinase and role as a
non-catalytic regulator of core signalling enzymes, pharmacological
approaches that target KSR have been lacking. This is in contrast to
current drug discovery and development efforts that have focused
extensively on direct inhibitors of the Ras effector kinases RAF, MEK,
and ERK.

To explore an alternative form of pharmacological modulation and
identify Ras-MAPK antagonists via KSR, we focused on large forward
genetic screens conducted in flies and worms that identified mutant
Ras-selective suppressor alleles in KSR. The studies in flies alone
evaluated approximately 900,000 randomly mutated strains searching for
genetic modifiers of a Ras(G12V)-dependent rough-eye phenotype.
We mapped the suppressor alleles onto the primary sequence of KSR
(Extended Data Fig. 1a) and a recently determined X-ray crystal
structure of the human KSR2 pseudokinase domain in complex with
MEK1 and ATP, and noted a high concentration of suppressor mutations
immediately adjacent to the KSR ATP-binding pocket (Fig. 1a).
On the basis of this analysis, we hypothesized that the RAF and MEK
interaction interfaces in KSR may be uncoupled through ligands that
engage the KSR ATP-binding pocket. Specifically, we speculated that
small molecules, which bias KSR towards a state similar to that revealed
in the KSR2-MEK1-ATP crystal structure, might function as antagonists
of KSR-dependent regulation of RAF and MEK.

To identify active-site-directed ligands of KSR, we screened a collection
of 176 structurally diverse kinase inhibitors for direct competition
of an activity-based probe (ATPᵇⁱᵒᵗⁱⁿ) that specifically labels the
ATP-binding pocket of purified KSR2-MEK1 complexes (Fig. 1b, c). From
this analysis we identified APS-1-68-2 as a competitor of probe-labelling
of KSR2-MEK1. This quinazoline-biphenyl ether compound has previously
been described as both a Src and epidermal growth factor receptor
(EGFR) family kinase inhibitor. Synthetic tailoring of APS-1-68-2
generated highly informative structure-activity relationships (Fig. 1d).
For example, deletion of the terminal phenyl group (APS-1-82-1) or
extension of the ether linker (APS-2-12) diminished KSR2-MEK1
probe competition. Notably, addition of a single methyl group at the
internal phenyl generated a potent probe compound (APS-2-79; IC₅₀ of
KSR2 = 120 ± 23 nM), whereas the similar dimethyl substituted compound
(APS-3-77) was essentially inactive (IC₅₀ of KSR2 > 10,000 nM).

To assess the biological function of these compounds as Ras-MAPK
pathway antagonists, we developed a simplified cell-based reconstitution
system to directly monitor KSR-driven MAPK signalling (Fig. 1e).
This system, in which cellular MAPK signalling is dependent on
KSR expression, was found to be sensitive to known Ras suppressor
mutations in KSR (Fig. 1f). Likewise, APS-2-79 also suppressed
KSR-stimulated MEK and ERK phosphorylation (Fig. 1g; P < 0.005
lanes 1 versus 2). The suppression of MAPK signalling by APS-2-79
was dependent on direct targeting of KSR as an active site mutant
(KSR(A690F)), which has previously been demonstrated to stimulate
KSR-based MAPK outputs independent of ATP-binding, significantly
diminished the activity of APS-2-79 (Fig. 1g; lanes 5 versus 6, NS; lanes
2 versus 6, P < 0.005). Notably, the negative control for KSR-binding
(analogue APS-3-77; see Extended Data Fig. 2b, c for comparative
selectivity profiling) was inactive, whereas a positive-control RAF
inhibitor, dabrafenib, was active irrespective of the KSR-mutational
status (Fig. 1g). Therefore, on the basis of similarity in phenotype and
also direct-binding activity, we identify APS-2-79 as a small-molecule
mimic of KSR alleles that suppress oncogenic Ras mutations.

KSR-based activity of APS-2-79 as a MAPK antagonist was further
evaluated using reconstitution assays. Dose-dependent phosphorylation
of MEK on Ser218/Ser222 by RAF in vitro could be enhanced at
least fivefold in the presence of KSR (Extended Data Fig. 3a–c).
KSR-stimulated MEK phosphorylation by RAF was markedly reduced
by the addition of APS-2-79, but not by APS-3-77 (Extended Data
Fig. 3d, e). APS-2-79 was inactive when KSR was absent or when the
KSR2(A690F) mutant was used for in vitro assays (Extended Data
Fig. 3d, f, g), suggesting that the activity of APS-2-79 derives from direct
targeting of KSR. Indeed, APS-2-79 lacked direct activity against the
highly homologous active RAF family kinases, including recombinant
BRAF and CRAF, or cellular BRAF(V600E) (Extended Data Figs 2,
3, 4a). Therefore, on the basis of reconstitution and selectivity assays, we
conclude that APS-2-79 functions as an antagonist of MEK
phosphorylation by RAF through direct binding of the KSR active site.

Figure 1 | The small molecule APS-2-79 mimics KSR alleles that
suppress oncogenic Ras mutations. a, Oncogenic Ras-suppressor
mutations (red) localize to the ATP-binding pocket (yellow), as well as
RAF- and MEK-interaction interfaces, in KSR. Shown is the putative
structure of the RAF-KSR-MEK complex. b, An activity-based probe
(ATPᵇⁱᵒᵗⁱⁿ) specifically labels the ATP-binding pockets of purified
KSR2-MEK1 complexes. 2 µM of ATPᵇⁱᵒᵗⁱⁿ was incubated with KSR2-MEK1
in the presence of the indicated concentrations of free ATP. Biotin,
total MEK, and total KSR western blots are shown. c, A kinase inhibitor
screen for direct competitors of probe-labelling in purified KSR2-MEK1
complexes provides informative structure-activity relationships data.
d, Chemical structures of leads. IC₅₀ values (mean ± s.d.; n = 2 biological
replicates) against ATPᵇⁱᵒᵗⁱⁿ probe-labelling of KSR2 are listed below
structures. e, Co-expression of full-length KSR-Flag and MEK1-GFP
leads to enhanced MAPK signalling within 293H cells, as visualized by
immunoblotting for phosphorylated MEK and ERK. f, MAPK activation
is sensitive to known genetic suppressor mutations in KSR. A690F is a
KSR mutant predicted to signal independent of ATP-binding. W884D is
a loss-of-function mutation predicted based on structural analysis. Note,
human KSR2 numbering used here and throughout. g, APS-2-79 impedes
KSR-stimulated MAPK signalling within cells by wild-type KSR but not a
control mutant (KSR(A690F)). Cells were treated with 5 µM of APS-2-79,
APS-3-77, or dabrafenib for 2 h. In e–g, cells were collected for western
blot analysis 24 h after transfection. Error bars indicate the mean ± s.d.
(n = 3 biological replicates). Signals were normalized relative to lane 1
(e and g) or 3 (f). NS, not significant. ***P < 0.0005 by two-tailed
unpaired t-testing.

Notably, we found that a previously described ATP-competitive and
active-state binder of KSR termed ASC24 (ref. 7), in contrast to APS-2-79,
did not antagonize KSR-dependent MEK phosphorylation by RAF
(Extended Data Fig. 3d, e), suggesting that inhibition of catalytic activity
alone in KSR is insufficient to block MAPK signalling. Consistent with
this notion, removal of putative KSR phosphorylation sites in MEK
neither impeded MAPK signalling nor blocked the inhibitory activity
of APS-2-79 within cells (Extended Data Fig. 4b).

Previous studies established that genetic suppressors in KSR may
impede RAF-induced conformational changes in KSR required
for MEK activation or destabilize KSR-MEK and KSR-RAF
complexes. To distinguish between such possible modes of
action, we determined an X-ray crystal structure of the KSR2-MEK1
complex bound to APS-2-79 (Fig. 2a). In the APS-2-79-bound state,
KSR2 binds MEK1 in a 1:1 fashion within a quaternary arrangement
that is nearly identical to the ATP-bound state of KSR2-MEK1
complexes (Extended Data Fig. 5). Within both states, KSR2 and
MEK1 bind via a face-to-face arrangement mediated largely through
reciprocal helix αG and activation segment interactions, and KSR2
homodimerizes through the N-lobe along a crystallographic two-fold
symmetry axis producing a hetero-tetramer of KSR2-MEK1 dimers.
In the APS-2-79-bound state, only KSR2 was found to possess strong
electron density that could be assigned to APS-2-79 (Extended Data
Fig. 6a, b). Two portions of APS-2-79 engage distinct regions in KSR2.
First, the biphenyl ether extends to a sub-pocket within KSR2, defined
by Thr739, Arg692, Asp803 and a hydrophobic shell composed of
Phe725, Tyr714 and Phe804 (Fig. 2b, c). Stacking interactions between
the terminal phenyl in APS-2-79, and Phe725, Tyr714 and Phe804 in
particular are expected to provide strong interactions between KSR2
and APS-2-79 through the arrangement of a four-member
aromatic-pair network (Fig. 2b, c). The existence of this network was
substantiated by removal of the terminal phenyl in APS-2-79-like
compounds, which greatly diminished competition of ATPᵇⁱᵒᵗⁱⁿ
probe-labelling in KSR2 (Extended Data Fig. 7; APS-1-68-2 versus
APS-1-70-1 and APS-1-82-1). This network of aromatic-pair interactions,
in addition to other amino acid substitutions, probably contributes to the
selectivity of APS-2-79 for KSR over RAF (Extended Data Fig. 6c, e).
Second, a hydrogen bond between the N1 in the quinazoline core of
APS-2-79 and the backbone at Cys742 further mediates APS-2-79-KSR2
interactions. Notably, functionalization of the N1 with a methyl group
(APS-3-6) greatly diminished KSR2-MEK1 activity, whereas replacement
of the N3 with -CH (APS-2-16) was moderately tolerated (Extended Data
Fig. 7). Therefore, on the basis of crystallographic analysis and also
structure-activity relationships data from our analogue series,
APS-2-79 binds directly to KSR2 within the KSR2-MEK1 complex.

Figure 2 | Structural analysis of APS-2-79 bound to the KSR2-MEK1
complex. a, The KSR2-MEK1-APS-2-79 complex. Highlighted are two key
phospho-regulatory residues in MEK1, Ser218 and Ser222. b, Magnified
stereo view of interactions between KSR2 and APS-2-79. Fo − Fc omit
map contoured at 3.5σ, generated with APS-2-79 omitted, is represented
as a blue mesh. c, Schematic of the APS-2-79 binding site within KSR2.
d, Magnified view of the KSR2 active site bound to APS-2-79, including
the ‘induced lock’ (residues I809–Q814; orange). The disordered P-loop is
highlighted by a dashed line. e, Overlay between the ATP-bound (yellow)
and APS-2-79-bound states of KSR2.

In both the APS-2-79- and ATP-bound states of KSR2-MEK1, KSR2 
directly engages the activation segment of MEK1, burying the Ser218- 
Ser222 region and presumably shielding this segment of MEK from 
promiscuous phosphorylation. The KSR2-MEK1-APS-2-79 structure 
revealed a portion of KSR2 that was not previously modelled in the ATP- 
bound complex (Extended Data Fig. 6d). This region, encompassing 
residues Ile809 to Gln814, which we refer to as the induced lock, forms 
an extension of the activation segment C terminus to the conserved 
DFG motif, and forms an anti-parallel β-strand with the peptide 
sequence centred around Arg823 in KSR2 (Fig. 2d). Additionally, the 
ordering of residues Ile809 to Gln814 in KSR2 occurs at the expense of 
disorder of residues 674 to 676 in the P-loop, which in the ATP-bound 
state directly coordinates the β and γ phosphates (Fig. 2e). The two 
modes by which ATP and APS-2-79 affect KSR-based interactions on 
MEK appear mutually exclusive as both ligands induce conformations 
that would otherwise clash with one another (Fig. 2e). We interpret 
these structures to suggest that APS-2-79 stabilizes an inactive state 
of KSR2 characterized by reinforcement of negative regulatory inter- 
actions. Indeed, APS-2-79 behaves as a KSR-dependent antagonist of 
RAF-mediated MEK phosphorylation by shifting the equilibrium of 
KSR-MEK complexes so as to populate the OFF state (Extended Data 
Fig. 5c). 

Comparison of the ATP-bound and APS-2-79-bound states of KSR2- 
MEK1 suggested that APS-2-79 antagonizes RAF phosphorylation on 
MEK indirectly by impeding KSR-RAF heterodimers. As well as APS- 
2-79 binding, the dimer interface of KSR2, including residues Trp685 
and His686, demonstrated perturbations relative to the ATP-bound 
conformation (Fig. 3a, Extended Data Fig. 8). To investigate directly 
the effect of APS-2-79 on KSR2-BRAF dimerization, we used bio-layer 
Figure 3 | APS-2-79 impedes higher order assembly of RAF-KSR-MEK 
complexes. a, Mapping of residues with a root-mean-square (r.m.s.) 
deviation of >2.0 Å between the ATP- and APS-2-79-bound states of 
KSR2-MEK1 (blue) highlights alterations at contact residues Trp685 and 
His686 within the putative KSR-RAF heterodimer interface. b-g, BRAF and 
BRAF mutants (F667E and/or R509H) were immobilized on sensor-heads 
and KSR2-MEK1 or MEK1 assembly was monitored using bio-layer 
interferometry. Association occurred from 0 to 660 s and dissociation was 
monitored thereafter up to 1,500 s. APS-2-79 was added in the presence 
of KSR2-MEK1 at a concentration of 25 µM. KD values represent the 
mean ± s.e.m. derived from global fitting of all 5 binding curves. 




© 2016 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 


LETTER 


[Figure 4 graphic. a, Trametinib (nM) by APS-2-79 synergy matrices with 
Bliss scores, and insets of cell viability (%) versus log(trametinib (nM)) 
for DMSO and 1 µM APS-2-79, in HCT-116: KRAS(G12D), A549: KRAS(G12S), 
SK-MEL-239: BRAF(V600E) and A375: BRAF(V600E). b, Bliss scores in 
BRAF-mutant and KRAS-mutant cell lines. c, pERK and tMEK blots for 
HCT-116: KRAS(G12D) and SK-MEL-239: BRAF(V600E). d, Model in 
Ras-mutant cells.] 

Figure 4 | APS-2-79 enhances the efficacy of the clinical MEK inhibitor 
trametinib within cancer cell lines containing K-Ras mutations. 
a, Dose-responses of APS-2-79 and trametinib (MEKi) on viability of 
K-Ras-mutant (HCT-116, A549) and BRAF-mutant (A375, SK-MEL-239) 
cell lines. Bliss scores represent the mean calculated from two biological 
replicates of the depicted concentration matrices. Numbers listed within 
synergy matrices, which represent the percentage of growth inhibition 
relative to DMSO controls, are the mean of the replicates. Insets highlight 
dose-responses of trametinib in the absence or presence of 1 µM APS-2-79 
(points along each line represent the mean of two biological replicates). 
b, Synergy, as determined by Bliss independence scores (see Methods), 
for combinations of APS-2-79 and APS-3-77 with trametinib. Error bars 
represent the mean ± s.d. of Bliss scores as determined in a and Extended 
Data Fig. 9, derived from either K-Ras-mutant or BRAF-mutant cancer 
cell lines (n = 5 for each). **P < 0.005 by two-tailed unpaired t-testing. 
c, Pathway analysis suggests that the increased potency of trametinib in 
the presence of APS-2-79 occurs through enhanced downregulation of 
Ras-MAPK signalling (as measured by phospho-ERK). HCT-116 and 
SK-MEL-239 cells were treated for 48 h with increasing concentrations 
of trametinib combined with DMSO, 250 nM, and 1 µM APS-2-79. IC90 
values represent the mean ± s.d. (n = 2 biological replicates). d, Model 
for synergy between the MEK inhibitor (MEKi) trametinib and the KSR 
inactive state binder (KSRi) APS-2-79. APS-2-79 enhances the efficacy 
of trametinib by antagonizing MEKi-induced Ras-MAPK signalling 
complexes. 


interferometry (BLI) to monitor real-time association and dissociation 
of KSR2-MEK1 or free MEK1 to a sensor tethered with immobilized 
BRAF. In control experiments, we found that KSR2-MEK1 complexes 
did not associate with immobilized BRAF in a 1:1 fashion (Fig. 3b), 
probably owing to the formation of higher order BRAF-KSR2-MEK1 
complexes. In contrast, BRAF bound to free MEK1 in a 1:1 fashion with 
a dissociation constant (KD) = 51 ± 3.8 nM (Fig. 3c), which is in close 
agreement with published work. 

To specifically monitor KSR2-BRAF dimerization relative to other 
possible interactions, we identified a mutation in BRAF(F667E) that 
eliminates binding to free MEK but not KSR2-MEK1 complexes 
(Fig. 3d, e). KSR2-MEK1 interacted in a 1:1 fashion with the 
BRAF(F667E) mutant with a KD of 1.99 ± 0.09 µM, closely matching 
previously published BRAF-BRAF dimerization values. Notably, the 
addition of a secondary mutation, known to perturb KSR2-BRAF 
dimers (BRAF(F667E/R509H); Fig. 3f), completely abrogated any 
binding signal between KSR2-MEK1 and BRAF. In the presence of 
APS-2-79, the KSR2-BRAF(F667E) dimers did not associate (Fig. 3g), 
consistent with the prediction of the crystal structure suggesting that 
APS-2-79 may impede RAF-KSR dimers. In contrast, the control com- 
pound APS-3-77 did not impede KSR2-BRAF interactions (Extended 
Data Fig. 8c). Therefore, we conclude that BRAF can dimerize with 
KSR2-MEK1 complexes directly via KSR2, and this interaction is 
antagonized by APS-2-79. 
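
The "1:1 fashion" and KD values discussed above refer to the standard single-site (Langmuir) binding model commonly used to fit bio-layer interferometry curves. The sketch below is only an illustration of that textbook model, not the authors' fitting procedure: the rate constants `K_ON` and `K_OFF` are hypothetical, chosen solely so that their ratio equals the reported 51 nM, and `r_max` is an arbitrary response scale. The 660 s association window is taken from the Figure 3 legend.

```python
import math


def association(t, conc, k_on, k_off, r_max):
    """Response at time t (s) while analyte at `conc` (M) binds 1:1."""
    k_obs = k_on * conc + k_off          # observed rate of approach to equilibrium
    k_d = k_off / k_on                   # equilibrium dissociation constant (M)
    r_eq = r_max * conc / (conc + k_d)   # plateau response at this concentration
    return r_eq * (1.0 - math.exp(-k_obs * t))


def dissociation(t, r_start, k_off, t_switch):
    """Response at time t after the sensor is moved to buffer at t_switch."""
    return r_start * math.exp(-k_off * (t - t_switch))


def sensorgram(times, conc, k_on, k_off, r_max, t_switch=660.0):
    """Association up to t_switch, then single-exponential dissociation."""
    r_start = association(t_switch, conc, k_on, k_off, r_max)
    return [
        association(t, conc, k_on, k_off, r_max) if t <= t_switch
        else dissociation(t, r_start, k_off, t_switch)
        for t in times
    ]


# Hypothetical rate constants (assumption): only their ratio is tied to the text.
K_ON = 1.0e4     # 1/(M*s)
K_OFF = 5.1e-4   # 1/s
K_D = K_OFF / K_ON  # 5.1e-8 M, i.e. 51 nM

if __name__ == "__main__":
    times = list(range(0, 1501, 300))
    for conc in (625e-9, 2.5e-6, 10e-6):  # illustrative analyte concentrations (M)
        curve = sensorgram(times, conc, K_ON, K_OFF, r_max=1.0)
        print(conc, [round(r, 3) for r in curve])
```

In a global fit, one `k_on`, `k_off` pair is shared across all concentrations. Curves that this model cannot describe, such as the KSR2-MEK1 binding to wild-type BRAF in Fig. 3b, are what the text calls not 1:1.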

Ras mutations occur in approximately 25% of all cancer patients and 
are highly associated with poor response to therapy. Significant progress 
has been made in targeting BRAF(V600E)-mutant melanoma; however, 
RAF and MEK inhibitors have failed to achieve significant clinical efficacy 
in Ras-mutant disease owing in part to mechanisms of inhibitor-induced 
transactivation and feedback, respectively. MEK-inhibitor feedback has 
been characterized by upstream Ras activation and induction of higher- 
order RAF-RAF and also RAF-KSR complexes. In an engineered 
cell system, we found that a Ras-suppressor allele (R718H) within 
KSR reduced MEK inhibitor-induced feedback (Extended Data Fig. 4c), 
suggesting the possibility that KSR heterodimerization may limit the effi- 
cacy of MEK inhibitors. Owing to the more pronounced role of KSR in 
Ras-mutant, as opposed to RAF-mutant signalling, and the ability of 
APS-2-79 to impede KSR-RAF heterodimerization, we hypothesized that 
stabilization of the KSR-inactive state (KSRi) via APS-2-79 may potentiate 
the effect of MEK inhibitors by limiting feedback in Ras-mutant models. 
We therefore tested for synergy of APS-2-79 with MEK inhibitors in 
Ras-mutant cell lines, and used RAF-mutant cell lines as controls. 
We found that APS-2-79 shifted the cell viability dose response to 
trametinib in Ras-mutant cell lines HCT-116 and A549, but not BRAF 
mutant cell lines SK-MEL-239 and A375 (Fig. 4a). Although the cellular 
effects of APS-2-79 alone were modest, combination analysis over full 
concentration matrices revealed that KSRi synergizes with trametinib, 
and other MEK inhibitors (Extended Data Fig. 9a), specifically in 
KRAS mutant cell lines (Fig. 4b). APS-3-77, and additional control 
compounds (Extended Data Fig. 9b and 10), did not demonstrate 
Ras-mutant-specific synergy, supporting the hypothesis that the 
enhanced activity of trametinib when combined with APS-2-79 
depends on co-modulation of KSR. To determine the possible mecha- 
nism for APS-2-79 and trametinib synergy, we examined MAPK signal- 
ling and found that APS-2-79 treatment caused a twofold enhancement 
in the IC90 of trametinib on ERK phosphorylation in the Ras-mutant 
HCT-116 cell line but not the RAF-mutant SK-MEL-239 cell line 
(Fig. 4c, Extended Data Fig. 4d). The data presented here provide proof- 
of-concept for the use of KSRi to overcome a key liability of a clinical 
MEK inhibitor in K-Ras mutant cells. Indeed, we posit stabilization of 
the KSRi as a mechanism to impede feedback activated Ras-MAPK 
signalling induced by MEK inhibition (Fig. 4d). 
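
The synergy analysis above rests on Bliss independence, which predicts the effect of two drugs that act independently. The sketch below shows one common way to compute Bliss excess over a concentration matrix; it is an assumption-laden illustration, since the exact scoring used for the reported Bliss scores is defined in the paper's Methods and is not reproduced here. All inhibition values in the example are hypothetical, and the function names are illustrative.

```python
def bliss_expected(f_a, f_b):
    """Expected fractional inhibition if drugs A and B act independently."""
    return f_a + f_b - f_a * f_b


def bliss_excess(single_a, single_b, observed):
    """Observed minus expected inhibition for every cell of the dose matrix.

    single_a[i]: fractional inhibition by drug A alone at dose i
    single_b[j]: fractional inhibition by drug B alone at dose j
    observed[i][j]: fractional inhibition by the combination
    """
    return [
        [observed[i][j] - bliss_expected(f_a, f_b)
         for j, f_b in enumerate(single_b)]
        for i, f_a in enumerate(single_a)
    ]


def bliss_sum_percent(excess):
    """Sum of the excess over the matrix, in percentage points (one convention)."""
    return 100.0 * sum(sum(row) for row in excess)


if __name__ == "__main__":
    # Hypothetical fractional growth inhibition values (not data from the paper).
    drug_a_alone = [0.00, 0.05, 0.10]   # e.g. a KSR binder at three doses
    drug_b_alone = [0.00, 0.30, 0.60]   # e.g. a MEK inhibitor at three doses
    combination = [
        [0.00, 0.30, 0.60],
        [0.05, 0.50, 0.75],
        [0.10, 0.60, 0.85],
    ]
    excess = bliss_excess(drug_a_alone, drug_b_alone, combination)
    print(round(bliss_sum_percent(excess), 1))
```

Positive excess means the combination inhibits more than independence predicts (synergy), and negative excess indicates antagonism. Summing the excess over the matrix is one of several conventions and is used here only for illustration.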

Here we have identified a unique conformation in KSR through the 
discovery of APS-2-79. This compound offers a foundation for the 
development of a new class of targeted therapies based on stabiliza- 
tion of the KSR inactive state. Future efforts will be directed towards 
improving the pharmacological properties of APS-2-79 to enable 
in vivo and clinical studies. In general, the stabilization of conformational 
states with small-molecule modulators may be an effective strategy 
to target other pseudokinases. Furthermore, the results presented 
here, using KSRi in combination with clinical MEK inhibitors, suggest 
a mechanism to improve the efficacy of inhibitors that target enzymat- 
ically active kinases through co-modulation of pseudokinase-active 
kinase signalling complexes. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
Extended Data Figure 1 | Projection of Ras(G12V) suppressor 
alleles onto the primary and tertiary structure of KSR. a, Schematic 
representation of KSR from Drosophila, Caenorhabditis elegans, and KSR1 
or KSR2 from humans. Suppressor mutations within KSR identified 
from forward genetic screens are highlighted with red stars. Allele names 
and corresponding mutations are given. Two alleles in KSR found 
in the Drosophila screen are shown; one encoding for substitutions in a 
coiled-coil SAM domain (CC-SAM) at the N terminus of Drosophila KSR 
(S548) and a second mutant in the predicted ATP-binding pocket of the 
KSR pseudokinase domain (S638). Eight distinct alleles were described 
in two separate studies conducted in C. elegans. The vast majority of the 
mutants localize to the pseudokinase domain of KSR and in particular 
ATP-contact residues (yellow). Residues highlighted in red and shown 
in the lower panel correspond to the human KSR2 residue equivalents of 
suppressor mutations found in Drosophila and C. elegans orthologues. 
b, KSR is a scaffold for the Ras-MAPK signalling pathway. Phosphorylation 
of MEK1/2 at Ser218 and Ser222 by RAF, and of ERK1/2 at Thr202 and 
Tyr204 by MEK, are key events in signalling through 
the Ras-MAPK signalling pathway. c, Purification of the KSR2-MEK1 
complex from insect cells. The KSR2 pseudokinase domain (KSR2(KD)) 
and MEK1 were co-expressed using the SF21 insect cell system. Lysis was 
performed by one freeze-thaw and sonication. Lysates were incubated 
with cobalt resin for 2 h and KSR2(KD)-MEK1 was eluted using a high- 
imidazole buffer. Eluate was then incubated with tobacco etch virus (TEV) 
protease and λ-phosphatase overnight. The mixture was then applied to 
an ion-exchange column (Sp-HP) to separate stoichiometric KSR2-MEK1 
complexes from free MEK1 and TEV. Fractions containing KSR2-MEK1 
were applied to a gel-filtration column for final purification. d, Schematic 
of the ATP-biotin probe-labelling assay on KSR2-MEK1 complexes and 
screen for inhibitors. e, ATP-biotin directly labels KSR2 and MEK1 within 
purified complexes. Deconvoluted mass spectrum for KSR2-MEK1 
complexes incubated with ATP-biotin. KSR2 and MEK1 spectra are included 
in the top and bottom panels, respectively. f, Graphical representation for 
ATP-biotin probe-labelling of KSR2-MEK1 complexes in the presence of 
increasing free ATP as shown in Fig. 1b. Corresponding IC50 values listed 
for both KSR2 and MEK1. 
Extended Data Figure 2 | APS-2-79 and APS-3-77 are positive and 
negative binders of KSR2. a, Chemical structures of APS-2-79 and 
APS-3-77 with respective IC50 values (mean ± s.d.; n = 2 biological replicates) 
for KSR2. b, Representative western blot images of in vitro ATP 
competition assays using recombinant MAPK family member proteins. 
Probe-labelling of the indicated kinases was measured in the presence of 
increasing concentrations of APS-2-79, APS-3-77, or a positive control 
compound. For CRAF, BRAF, and BRAF(V600E), the positive control was 
dabrafenib; for MEK1, the ATP-competitive inhibitor termed Wyeth-2b 
(ref. 28); and for ERK, SCH772984 (ref. 29). The listed IC50 values include 
mean ± s.d. based on two biological replicates. c, APS-3-77 and APS-2-79 
share partially overlapping kinome-wide inhibitory profiles. The graph 
shows the percentage of inhibition of APS-2-79 and APS-3-77 (both at 1 µM) 
against 246 kinases. The raw data for this graph are in Supplementary Table 1. 
d, Inset showing the 25 kinases most inhibited by APS-2-79 and APS-3-77. 
Kinases with near-equal sensitivity to these inhibitors as measured here 
include YES1, ERBB4, and FGR; variable sensitivity kinases include CSK, 
HCK, and MERTK. 
Extended Data Figure 3 | APS-2-79 hinders RAF-mediated MEK 
phosphorylation in a KSR-dependent manner. a, Schematic of the 
RAF phosphorylation assay of free KSR2-MEK1 and MEK1. 
b, Phosphorylation of the indicated concentrations of MEK1 and the 
KSR2-MEK1 complex by BRAF (200 nM) in the presence of 1 mM 
ATP. Representative blots for phospho-MEK (top; as detected using a 
MEK1/2(pS218/pS222) antibody) and total MEK (tMEK; bottom) are 
shown. c, Plots of pMEK versus time (seconds) at various concentrations 
of MEK1 and the KSR2-MEK1 complex. Bands were quantified and the 
phospho-MEK signal normalized relative to lane 20 in both panels. Data 
points of two biological replicates are included along each line. The rates of 
MEK phosphorylation (kobs; pMEK per second; far right) are represented 
in bar graphs and are derived from the linear phase of the plots in the 
left hand panels. Bars represent mean of two biological replicates; values 
for each replicate are shown as points. d, Rates of BRAF (left) and CRAF 
(right) phosphorylation of the indicated MEK complexes (KSR2-MEK1; 
KSR2(A690F)-MEK1; and free MEK). Bars represent mean of two 
biological replicates; values for each replicate are shown as points. 
e-g, APS-2-79 inhibits BRAF and CRAF phosphorylation of MEK in a 
KSR-dependent manner. Phosphorylation of 500 nM KSR2-MEK1 
(e), or KSR2(A690F)-MEK1 (f), and MEK1 (g) by BRAF (200 nM) or 
CRAF (10 nM) in the presence of 1 mM ATP and the indicated inhibitors. 
Representative western blots of phospho-MEK (as detected using a 
MEK1/2(pS218/pS222) antibody) are shown. Bars represent mean of two 
biological replicates; individual data points of each replicate are shown. 
Extended Data Figure 4 | APS-2-79 activity is not dependent on KSR 
phosphorylation sites in MEK or direct RAF inhibition. a, APS-2-79 
does not affect BRAF(V600E)-induced MAPK activation in cells. 
BRAF(V600E)-Flag was expressed for 24 h in 293H cells. Cells were then 
treated for 2 h with DMSO or 5 µM of either APS-2-79, APS-3-77, or 
dabrafenib before collection and western blot analysis of phosphorylated 
MEK (MEK1/2(pSer218/pSer222)) and ERK (ERK1/2(pT202/pY204)). 
b, Removal of putative KSR phosphorylation sites in MEK (MEK(AAAA); 
S18A, T23A, S24A, S72A; ref. 7) neither hinders KSR-dependent MAPK 
signalling, nor the activity of APS-2-79. Co-expression of full-length 
KSR-Flag and wild-type MEK1-GFP or MEK(AAAA)-GFP leads 
to enhanced MAPK signalling within 293H cells as visualized by 
immunoblotting for phosphorylated MEK (MEK1/2(pSer218/pSer222)) 
and ERK (ERK1/2(pT202/pY204)). APS-2-79 impedes KSR-stimulated 
MAPK signalling within cells through wild-type and MEK(AAAA) 
equally. Bars and error bars indicate pMEK and pERK intensity and 
standard deviations, respectively. Signals were normalized relative to 
lane 5. Error bars indicate the mean ± s.d. (n = 3 biological replicates). 
**P < 0.0005 by two-tailed unpaired t-testing. c, The dimer-deficient 
KSR(R718H) mutant, relative to wild-type KSR, is compromised in 
MEK-inhibitor-induced feedback. 293H cells were co-transfected with 
MEK-GFP and KSR-Flag or KSR(R718H)-Flag for 24 h and then treated 
with increasing concentrations of trametinib (range of 0.13 to 100 nM; 
threefold dilutions) for an additional 48 h. Cells were collected and 
analysed by western blot. d, Phospho-AMPK remains unchanged in 
HCT116 cells upon co-treatment with APS-2-79 and trametinib. HCT116 
cells were treated with APS-2-79 and/or trametinib for 48 h. Phospho- 
AMPK (top), phospho-ERK (pERK), and total MEK (bottom) western 
blots are shown. 
Extended Data Figure 5 | Higher order assembly of the KSR2-MEK1 
complex bound to ATP or APS-2-79. a, Assembly of the KSR2-MEK1 
heterodimer bound to APS-2-79. A crystal-packing two-fold symmetry 
axis of the asymmetric unit containing a single KSR2-MEK1 complex 
produces the heterotetramer. KSR2 bound to APS-2-79 is coloured green, 
and MEK1 is coloured red. The activation segments of KSR2 and MEK1 are 
coloured orange and white, respectively. The ‘induced lock’ (residues 809 
to 814) within KSR2 is highlighted as orange, red and blue spheres. 
b, Assembly of the KSR2-MEK1 heterodimer bound to ATP as reported in 
ref. 7 (PDB code: 2Y4I). A crystallographic two-fold rotation axis produces 
the heterotetramer. r.m.s. deviations between the heterodimers and 
heterotetramers, respectively, of the ATP- and APS-2-79-bound KSR2- 
MEK1 complexes are listed below. c, A model for APS-2-79 function 
as a KSR-targeted antagonist of MAPK signalling. APS-2-79 shifts the 
equilibrium of KSR2-MEK1 complexes so as to populate the OFF state (left), 
and thereby antagonizes RAF dimerization and subsequent phosphorylation 
of KSR-bound MEK (far right). The model for RAF dimerization and MEK 
phosphorylation is adapted from ref. 7. In this model, the role of RAF may 
be fulfilled by multiple active RAF-family kinases, such as C-RAF, bound 
within homo- or heterodimers of RAF-RAF or KSR-RAF, respectively. 
Extended Data Figure 6 | The APS-2-79 binding site within KSR2 and 
possible basis for KSR over RAF selectivity. a, APS-2-79 and ATP are 
overlaid in the KSR2 and MEK1 active sites, respectively. ATP was shown 
here to emphasize the MEK active site, but ATP was not included in the 
final model. Positive (blue) and negative (red) Fo − Fc electron density 
maps, calculated before modelling of APS-2-79, are contoured at 3.5σ. 
Strong-positive-difference density within KSR2 supported modelling of 
APS-2-79 bound to KSR2 within the KSR2-MEK1 complex. b, Electron 
density map (blue mesh) for APS-2-79 (sticks) contoured at 4.5σ. Map 
represents positive difference density within the KSR2 active site 
before modelling of APS-2-79. c, Superposition of KSR2 (ATP- and 
APS-2-79-bound) with BRAF monomer (PDB code: 4W05) and BRAF 
dimer (PDB code: 3C4C) co-crystal structures reveals the possible bases 
for selectivity of APS-2-79 for KSR over RAF proteins. Residues within the 
APS-2-79 binding pocket that diverge between KSR and RAF proteins, but 
which are highly conserved within both sub-families are indicated with 
arrows. Thr802 in KSR2, which is universally a Gly residue in all active 
RAF homologues, and also Phe516 and Phe793 in KSR2, which adopt 
distinct orientations from the equivalent Phe residues in RAF kinases, 
directly contact the biphenyl ether motif in APS-2-79. The T802G 
substitution, as well as the positional differences of the above-mentioned 
aromatic residues, would be predicted to reduce binding of active RAFs 
with APS-2-79. Another interaction that is probably favoured in KSR 
includes the contact mediated by the epsilon nitrogen of Arg692 with the 
-O- linker of the biphenyl motif; the placement of Arg692 is stabilized by 
Asp803 of the DFG motif. In RAF, the Arg-to-Lys substitution (Lys483 in 
subdomain II of BRAF) lacks the equivalent nitrogens to bond with both 
the -O- linker in APS-2-79 and the aspartate of the DFG motif. d, Positive 
(blue) and negative (red) Fo − Fc electron density map contoured at ±2.5σ, 
before modelling of residues I809 to Q814 in KSR2, is shown. e, Sequence 
alignment of KSR and RAF proteins. Arrows highlight APS-2-79 contact 
residues. 
Extended Data Figure 7 | In vitro ATP-biotin competition assays. 
Representative western blot images of in vitro ATP-biotin competition 
assays using recombinant KSR2-MEK1 and analogues reported in this 
study. Chemical structures are shown adjacent to assay blots. IC50 values 
(mean ± s.d.; n = 2 biological replicates) against ATP-biotin probe-labelling 
of KSR2 are listed below blots. Line graphs include data points from two 
biological replicates. 
Extended Data Figure 8 | Bio-layer interferometry binding data between 
BRAF and free MEK1 or the KSR2-MEK1 complex. a, Mapping of 
residues with a r.m.s. deviation of greater than 2.0 Å between 
the ATP- and APS-2-79-bound states of KSR2-MEK1 (right, blue), 
highlights alterations at contact residues Trp685 and His686 within the 
KSR-KSR homodimer (left, yellow) and KSR-RAF heterodimer (middle, 
orange) interfaces. b, Movement of Trp685-His686 within KSR2 between 
the ATP- and APS-2-79-bound states. A single protomer of KSR2 in 
the ATP-bound state (yellow), and both protomers (green and cyan) of 
the KSR2 dimer within the APS-2-79-bound state, are shown. Negative 
density around Trp685 and His686 in early-stage maps supported the 
conformational change in this loop between the ATP- and APS-2-79- 
bound states. c, The negative control compound APS-3-77 (25 µM) does 
not impact assembly of BRAF(F667E) and KSR2-MEK1. These assays 
were performed identically to the experiments in Fig. 3b-g. Coloured 
curves indicate dose ranges of KSR2-MEK1 or MEK1 from 625 nM to 
10 µM in the presence or absence of the indicated compounds. In all plots, 
association occurred from 0 to 660 s, and dissociation was monitored 
thereafter up to 1,500 s. d-e, Bio-layer interferometry of wild-type BRAF with 
MEK1 and KSR2-MEK1 in the presence of DMSO and 25 µM APS-2-79. 
These assays were performed identically to the experiments in Fig. 3b-g. 
Coloured curves indicate dose ranges of KSR2-MEK1 or MEK1 from 
625 nM to 10 µM in the presence or absence of the indicated compounds. 
In all plots, association occurred from 0 to 660 s, and dissociation was 
monitored thereafter up to 1,500 s. f, Table summary of BLI data in this 
figure and Fig. 3b-h. KD, kon, and koff values represent the mean and s.e.m. 
measurements derived from global fitting of 5 binding curves. χ² and R² 
describe experimental and model data correlations; <3 and above 0.95, 
respectively, indicate good fits. 
Extended Data Figure 9 | KSRi binder APS-2-79 synergizes with 
trametinib in Ras-mutant cells. a, Average Bliss score of the combination 
of trametinib, binimetinib, PD0325901, or AZD6244 with APS-2-79 in 
the Ras-mutant cell lines HCT116 and A549 versus the RAF-mutant cell 
lines A375 and SK-MEL-239. Full combination matrices of APS-2-79 
(range: 100 nM to 3 µM in threefold dilutions) with trametinib (range: 
0.01-100 nM in threefold dilutions), binimetinib (range: 0.1-10 µM in 
threefold dilutions), PD0325901 (range: 0.1-10 µM in threefold dilutions), 
and AZD6244 (range: 0.1-10 µM in threefold dilutions) were tested. Bars 
represent the mean Bliss scores calculated from two biological replicates 
of the depicted concentration matrices; points represent each calculated 
score. b, Average Bliss scores of APS-2-79 or APS-3-77 in combination with 
trametinib in RAF-mutant, RAS-mutant cell lines. SK-MEL-2 and HepG2 
are N-Ras-mutant cell lines, and MEWO is an NF1-mutant cell line. Bars 
represent the mean Bliss scores calculated from two biological replicates 
of the depicted concentration matrices; points represent each calculated 
score. c, Complete cell viability analysis of APS-2-79 (range: 100-3,000 nM 
in threefold dilutions) plus trametinib (range: 0.01-100 nM in threefold 
dilutions) over a full concentration matrix in the Ras-mutant LOVO, 
CALU-6, SW620, SK-MEL-2, and HEPG2 cell lines, the RAF-mutant 
COLO-205, H2087, and SW1417 cells, and the NF1-mutant MEWO 
cell line. Numbers listed within synergy matrices represent percentage 
of growth inhibition relative to DMSO control and are the mean of two 
biological replicates. Bliss scores represent the mean calculated from two 
biological replicates of the depicted concentration matrices. 
Extended Data Figure 10 | APS-2-79 synergizes with trametinib 
specifically in Ras-mutant cells compared to the HER-family and 
SRC-family inhibitors lapatinib and saracatinib. a, Chemical structures 
of APS-2-79 and quinazoline-containing kinase inhibitors saracatinib and 
lapatinib. The primary targets for saracatinib and lapatinib are c-Src and 
Her2, respectively. IC50 values against ATP-biotin probe-labelling of KSR2 
are listed below structures. b, Bliss score analysis of HCT-116, A549, A375, 
and SK-MEL-239 cells treated with APS-2-79, saracatinib, or lapatinib 
(range: 100-3,000 in threefold dilutions) in combination with trametinib 
(range: 0.01-100 in threefold dilutions). Bars represent the mean Bliss 
scores calculated from two biological replicates; points represent 
each calculated score. c, Absolute Bliss score of the indicated drugs in 
combination with trametinib in Ras-mutant relative to RAF-mutant 
cell lines demonstrates selective synergy in Ras-mutant cell lines for 
APS-2-79 compared to saracatinib and lapatinib. d, log of the combination 
index graphs of APS-2-79 in combination with trametinib in HCT-116 
versus SK-MEL-239 cells as compared to the fractional effect. Negative 
combination index over a broad fractional effect range within HCT-116, 
but not SK-MEL-239, indicates strong synergy. 
Structural basis for inhibition of a voltage-gated 
Ca²⁺ channel by Ca²⁺ antagonist drugs 


Lin Tang, Tamer M. Gamal El-Din, Teresa M. Swanson, David C. Pryde, Todd Scheuer, Ning Zheng & 
William A. Catterall 

Ca²⁺ antagonist drugs are widely used in therapy of cardiovascular 
disorders. Three chemical classes of drugs bind to three 
separate, but allosterically interacting, receptor sites on CaV1.2 
channels, the most prominent voltage-gated Ca²⁺ (CaV) channel 
type in myocytes in cardiac and vascular smooth muscle. 
The 1,4-dihydropyridines are used primarily for treatment of 
hypertension and angina pectoris and are thought to act as allosteric 
modulators of voltage-dependent Ca²⁺ channel activation, whereas 
phenylalkylamines and benzothiazepines are used primarily for 
treatment of cardiac arrhythmias and are thought to physically 
block the pore. The structural basis for the different binding, 
action, and therapeutic uses of these drugs remains unknown. 
Here we present crystallographic and functional analyses of drug 
binding to the bacterial homotetrameric model CaV channel CaVAb, 
which is inhibited by dihydropyridines and phenylalkylamines 
with nanomolar affinity in a state-dependent manner. The binding 
site for amlodipine and other dihydropyridines is located on the 
external, lipid-facing surface of the pore module, positioned at the 
interface of two subunits. Dihydropyridine binding allosterically 
induces an asymmetric conformation of the selectivity filter, in 
which partially dehydrated Ca²⁺ interacts directly with one subunit 
and blocks the pore. In contrast, the phenylalkylamine Br-verapamil 
binds in the central cavity of the pore on the intracellular side of the 
selectivity filter, physically blocking the ion-conducting pathway. 
Structure-based mutations of key amino-acid residues confirm 
drug binding at both sites. Our results define the structural basis for 
binding of dihydropyridines and phenylalkylamines at their distinct 
receptor sites on CaV channels and offer key insights into their 
fundamental mechanisms of action and differential therapeutic 
uses in cardiovascular diseases. 

CaV1 channels are composed of a complex of a pore-forming α1 
subunit associated with β, γ, and α2δ subunits. The α1 subunits 
contain four homologous domains with six transmembrane segments 
in each. Transmembrane segments S1-S4 form the voltage-sens- 
ing module, and S5, S6 and the intervening P-loop form the pore. 
The overall architecture of the mammalian skeletal muscle CaV1.1 
channel was recently elucidated at a resolution of ~4-6 Å by cryo- 
electron microscopy. However, higher-resolution structural analysis 
of mammalian CaV channels has not yet been achieved. The bacterial 
voltage-gated Na⁺ channel NaChBac and its relatives are homote- 
trameric proteins composed of four identical subunits, each analogous 
to one domain of a mammalian voltage-gated Na⁺ or Ca²⁺ channel. 
These bacterial channels probably represent the evolutionary ancestors 
of both mammalian channel families. The structures of bacterial Na⁺ 
channels have been determined at high resolution by X-ray crystallog- 
raphy in pre-open and inactivated states. Moreover, the structural 
basis for Ca²⁺ conductance and selectivity has been elucidated at atomic 
resolution through studies of CaVAb, a site-directed mutant of NaVAb 
with full Ca²⁺ channel function. We have used derivatives of CaVAb 
(see Methods) to define receptor sites and mechanisms of action of 
Ca²⁺ antagonist drugs at atomic resolution. 

CaVAb was inhibited by amlodipine with high affinity (Fig. 1a-c). 
No inhibition was observed during single depolarizations, indicating 
that amlodipine does not enter the open pore and block it (Fig. 1a). 
However, inhibition increased progressively during trains of depolarizations, 
reflecting increased binding affinity for the activated and/or 
inactivated states of CaVAb (Fig. 1b). After a train of 20 depolarizing 
pulses, the half-maximum inhibitory concentration (IC₅₀) for inhibition 
by amlodipine was 10 nM (Fig. 1c). This affinity was surprisingly 
high, considering the evolutionary distance between CaVAb and mammalian 
CaV1.2 channels, which have IC₅₀ values from 0.3 nM to 14 µM 
for various dihydropyridines”. 
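
The concentration dependence reported above can be sketched with the Hill equation that the Fig. 1c legend names (Hill coefficient of 1). The Python below is only an illustration: the function name `fraction_inhibited` and its arguments are hypothetical, and it assumes that block saturates at 100% of the current, which the text does not state.

```python
def fraction_inhibited(conc_nM: float, ic50_nM: float, n_hill: float = 1.0) -> float:
    """Hill equation: fraction of current blocked at a given drug concentration.

    Assumes (hypothetically) that maximal block is 100% of the current.
    """
    if conc_nM < 0 or ic50_nM <= 0:
        raise ValueError("concentration must be >= 0 and IC50 must be > 0")
    return conc_nM ** n_hill / (ic50_nM ** n_hill + conc_nM ** n_hill)


# Amlodipine on CaVAb after a 20-pulse train: IC50 = 10 nM (Fig. 1c).
# By definition, half of the current is blocked when the dose equals the IC50.
block_at_ic50 = fraction_inhibited(10.0, 10.0)
block_at_100nM = fraction_inhibited(100.0, 10.0)
```

With a Hill coefficient of 1, a tenfold excess over the IC₅₀ gives 100/110 of the maximal block, so the curve approaches saturation only gradually.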

Photoaffinity labelling and site-directed mutagenesis suggest that 
dihydropyridines bind to a receptor site at the interface of homologous 
domains III and IV and the adjacent pore module in domain 
III in CaV1.2 channels*-”7!. In CaVAb, four identical subunits form 
a homotetramer (Fig. 1d)'°. The structure of the amlodipine-CaVAb 
complex reveals the antagonist bound on the outer, lipid-facing surface 
of the pore module in the intersubunit crevice formed by neighbouring 
tilted S6 helices and the P-helix of the selectivity filter (Fig. 1d, e, 
yellow sticks). Despite the homotetrameric structure of CaVAb, only 
a single drug-binding site per tetramer is occupied, suggesting that 
drug-induced conformational changes prevent occupancy of more 
than one site. Amino-acid residues Y195, I199, F171, Y168 and F167 
form a hydrophobic pocket for interaction with amlodipine (Fig. 1f). 
The dihydropyridine ring is sandwiched between Y195 of S6 and 
F167 of the P-loop. F171 and I199 of S6 form the bottom of the cleft 
that accommodates the bound drug. Mutations of I199 (for example, 
I199S) had minimal effects on CaVAb function (Extended Data Fig. 1), 
but markedly reduced the affinity for amlodipine (IC₅₀ = 112 nM; 
Fig. 1c). 

Nimodipine inhibited CaVAb like amlodipine, but its IC₅₀ was 
100 nM (Fig. 2a-c). Nimodipine binds to the same site as amlodipine 
(Fig. 2d, e and Extended Data Fig. 2a, b). The substitution I199S 
increased the IC₅₀ for nimodipine from 100 nM to 5.7 µM (Fig. 2c), 
and W195Y increased it to 508 nM (Extended Data Fig. 3). The experimental 
Br-dihydropyridine derivative UK-59811 inhibited CaVAb 
with IC₅₀ = 194 nM (Extended Data Fig. 4) and bound in a similar 
position (Fig. 2f and Extended Data Fig. 2a, c). Anomalous scattering 
density from its Br atom further confirmed the location of the 
dihydropyridine-binding site at the interface between the S6 segments 
of two adjacent subunits surrounded by Y195, F171, F167, and I199 
(Fig. 2f, green mesh). High-resolution structures of CaVAb revealed 
16 molecules of bound lipid per tetramer'®. Without drugs, we found 
a single molecule of DMPC lipid aligned in the dihydropyridine-binding 
pocket with its polar headgroup facing the extracellular side and 
its long hydrocarbon tails projecting deep into the crevice formed by 
neighbouring S6 helices (Fig. 2g and Extended Data Fig. 2d). Thus, our 
structures reveal that dihydropyridine binding displaces an endogenous 
lipid molecule from their common binding site on CaVAb. 

In the absence of dihydropyridines, the CaVAb structure has fourfold 
symmetry around the pore axis!”. Four lipid molecules are found in the 
central cavity, occupying fenestrations that connect to the exterior of 
the channel (Fig. 3a). Binding of dihydropyridines to CaVAb rearranges 
the quaternary structure and breaks the fourfold symmetry (Fig. 3c, e, g, 
compare shaded cross-sections; see Supplementary Discussion of 
asymmetry induced by drug binding). With drug bound, the four lipid 
molecules in the central cavity lose their symmetric spatial organization, 
and the fenestration closest to the drug-binding site is no longer 
occupied by a lipid chain. 

Figure 1 | Structural basis for inhibition of CaVAb by amlodipine. 
a, Amlodipine structure. Ba²⁺ currents for 0 nM (black) and 10 nM (red) 
amlodipine during depolarization from −120 mV to 0 mV. b, State-dependent 
block by amlodipine after 50-ms pulses at 1 Hz from −120 mV 
to 0 mV (10 nM, circles; 100 nM, triangles; mean ± s.e.m.; n = 3-5). 
c, Inhibition by amlodipine. Data were fit by the Hill equation with nH = 1. 
CaVAb: IC₅₀ = 10 ± 0.4 nM; CaVAb I199S: IC₅₀ = 112 ± 10 nM; n = 3-5; 
mean ± s.e.m. d, Structure of CaVAb (top view in cylinders) 
binding amlodipine (yellow sticks). PM, pore module; VSD, voltage-sensing 
domain. e, CaVAb with bound amlodipine in side view. 
f, Dihydropyridine-binding pocket of CaVAb with the Fo-Fc electron 
density map (2.5σ, cyan) and amlodipine (yellow sticks). CaVAb residues 
contacted by amlodipine are highlighted in colours and labelled. 

Figure 2 | Inhibition of CaVAb by dihydropyridine binding at a lipid 
site. a, Nimodipine structure. Current records as in Fig. 1a. b, State-dependent 
block by nimodipine as in Fig. 1b: 5 nM (black), 25 nM (brown), 
100 nM (green), 1 µM (red), and 5 µM (blue); mean ± s.e.m.; n = 3-14. 
c, Inhibition by nimodipine as in Fig. 1c. CaVAb: IC₅₀ = 100 ± 9 nM; CaVAb 
I199S, IC₅₀ = 5.7 ± 0.6 µM; n = 3-14; mean ± s.e.m. d, Amlodipine (yellow 
sticks) bound to CaVAb. S5 and S6 helices in ribbons; residues surrounding 
amlodipine in sticks. e, Nimodipine bound to CaVAb. f, UK-59811 bound to 
CaVAb. Anomalous scattering density (3σ, green mesh) for Br in UK-59811. 
g, DMPC lipid in the drug-free dihydropyridine-binding site in yellow sticks. 

By introducing asymmetry, dihydropyridine binding triggers 
allosteric changes at the selectivity filter of CaVAb and alters binding 
of the substrate ion. There are three Ca²⁺-binding sites in the CaVAb 
selectivity filter: two high-affinity sites (Sites 1 and 2) followed by one 
lower-affinity site (Site 3) arranged sequentially from its extracellular 
to intracellular end'®. Without drug, Ca²⁺ binds near the central axis 
of the pore in a fully hydrated state, coordinated symmetrically by 
four D178 carboxylate side chains (Fig. 3b)'®. With dihydropyridines 
bound, Ca²⁺ binds to Site 1 asymmetrically in a partially dehydrated 
state, significantly off the central axis of the pore and closer to one or 
two D178 carboxylate groups at a distance of 2.8-3.3 Å (Fig. 3d, f, h and 
Extended Data Fig. 5a-d). This binding distance suggests direct interaction 
of bound Ca²⁺ with the carboxylate side chain (Supplementary 
Discussion). In contrast, binding of Ca²⁺ at Site 2 is unchanged 
(data not shown). The anomalous scattering density of Ca²⁺ confirms 
its off-axis location in Site 1 and on-axis location in Site 2 
(Extended Data Fig. 5e, f). 

Figure 3 | Dihydropyridine binding allosterically modifies Ca²⁺ 
binding in the selectivity filter. a, Outward view. Four symmetrical lipids 
(red sticks) occupy fenestrations in CaVAb without dihydropyridine. 
Four additional lipids bind to the side of the pore module (yellow sticks). 
b, Top view. Site 1 with hydrated Ca²⁺ (green) coordinated directly by 
D178 and indirectly by N181 on the extracellular end of the selectivity filter. 
c, Amlodipine binding (magenta sticks) induces asymmetry and causes 
rearrangement of lipids (red sticks). d, Top view. Site 1 with partially 
dehydrated Ca²⁺ and direct interaction with D178 due to binding of 
amlodipine. e, Binding of nimodipine (cyan sticks) induces asymmetry 
and reorganizes bound lipid. f, Partially dehydrated Ca²⁺ binds at Site 1 
with coordination distance of 3.2 Å to carboxylate side chains of D178. 
g, Binding of UK-59811 (blue sticks) to the dihydropyridine binding site 
induces asymmetry and reorganizes bound lipid. h, Ca²⁺ binds at Site 1 
with coordination distance of 2.8 Å to a carboxylate side chain of D178. 
Studies with quaternary phenylalkylamine analogues revealed that 
these drugs inhibit CaV1.2 channels only after cytoplasmic application, 
and that drug binding is increased by repetitive depolarization to open 
the pore*”?. It was therefore concluded that tertiary phenylalkylamines 
such as verapamil penetrate the membrane in uncharged form, are 
re-protonated in the cytosol, and block the CaV1.2 channel by entering 
the intracellular mouth of the open pore in their protonated form and 
binding to their receptor site*?*. Photoaffinity labelling and site-directed 
mutagenesis revealed that the phenylalkylamine receptor site 
is formed by S6 segments in domains III and IV of CaV1.2 channels, 
consistent with drug binding in the pore*”*"S. 

When Br-verapamil was perfused at −120 mV, the first depolarization 
to 0 mV showed progressive reduction of the current during 
the pulse (Fig. 4a). This profile supports a pore-blocking mechanism, 
in which the drug progressively enters and blocks the open pore. 
Repetitive depolarizing stimuli increased inhibition of CaVAb by 
Br-verapamil (Fig. 4b), yielding IC₅₀ values of 810 nM for Br-verapamil 
(Fig. 4c, blue squares) and 475 nM for verapamil (Extended Data 
Fig. 6a, b) at steady state. The action of these drugs is strikingly state-dependent: 
the IC₅₀ for Br-verapamil in the resting state is 24 µM, 
30-fold higher than observed after a train of depolarizing stimuli 
(Fig. 4c, blue circles). 
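
The roughly 30-fold state dependence quoted above follows directly from the two IC₅₀ values given in the Fig. 4c legend (24 µM at rest, 810 nM after a pulse train). A minimal arithmetic sketch, with the helper name `fold_shift` chosen here purely for illustration:

```python
def fold_shift(ic50_reference_nM: float, ic50_test_nM: float) -> float:
    """Ratio of two IC50 values expressed in the same unit (nM)."""
    if ic50_reference_nM <= 0 or ic50_test_nM <= 0:
        raise ValueError("IC50 values must be positive")
    return ic50_reference_nM / ic50_test_nM


RESTING_IC50_NM = 24_000.0   # Br-verapamil, resting state (24 µM, Fig. 4c)
USE_DEP_IC50_NM = 810.0      # Br-verapamil, after a train of pulses (Fig. 4c)

state_dependence = fold_shift(RESTING_IC50_NM, USE_DEP_IC50_NM)
```

The ratio comes out just under 30, in line with the rounded figure in the text.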

Our crystal structures revealed a single molecule of Br-verapamil 
bound in the central cavity on the intracellular side of the ion selectivity 
filter (Fig. 4d, e; see Supplementary Discussion of asymmetry induced 
by drug binding). The bound drug is oriented with its characteristic 
positively charged tertiary amino group facing in the extracellular 
direction pointing towards Site 3 in the selectivity filter. In this position, 
the bound phenylalkylamine would physically block the pore. The 
distance between the tertiary amino group and Ca²⁺ coordinated by the 
carbonyls of L176 is 5 Å. The methoxy groups in the aromatic rings are 
located close to the inner end of the fenestrations, surrounded by T206, 
M209 of the neighbouring subunit and T175, M174, L176 of the selectivity 
filter (Fig. 4f). The two aromatic rings of Br-verapamil interact 
with T206 residues from two neighbouring S6 helices (Fig. 4f). A view 
from the intracellular side shows that Br-verapamil binds closer to two 
subunits on one side of the pore (Fig. 4f). The anomalous scattering 
from Br-verapamil further defines the position of the aromatic ring 
that is farther from the amino group and confirms its interaction with 
T206 (Fig. 4e, green mesh). Mutations in T206 impair inactivation of 
CaVAb (Extended Data Fig. 6c, e) and markedly reduce the affinity for 
Br-verapamil. For example, the conservative mutation T206S increases 
the IC₅₀ for state-dependent inhibition from 810 nM to 24 µM (Fig. 4c, 
red squares) and the IC₅₀ for resting state inhibition of CaVAb from 
24 µM to 115 µM (Fig. 4c, red circles). The effects of these mutations 
on both resting and state-dependent block confirm that there is a direct 
interaction between the drug and T206. These results define the receptor 
site for pore block by phenylalkylamines at high resolution. Similar 
to the dihydropyridine-binding site, the phenylalkylamine-binding site 
is also occupied by lipid molecules in the absence of the drug. 

At concentrations above 1 µM, dihydropyridines inhibit voltage-gated 
Na⁺ channels in a manner consistent with pore block”®. At 
the high drug concentrations used in our crystallization studies, we 
found binding of UK-59811 (Fig. 4g-i) and other dihydropyridines in 
the pore of CaVAb. The anomalous scattering density of its Br places 
the dihydropyridine ring deep in the central cavity where it forms 
hydrophobic contacts with two neighbouring subunits (Fig. 4h, green 
mesh). Compared to Br-verapamil, UK-59811 bound more towards 
the intracellular base of the central cavity and was located closer to one 
subunit (Fig. 4g-i). Low-affinity block of Na⁺ channels by dihydropyridines 
bound in this site may contribute to cardiac arrhythmias caused 
by toxic overdoses of these drugs. 

Overall, our results provide a structural basis for understanding 
how dihydropyridines and phenylalkylamines bind at two distinct, 
but allosterically coupled receptor sites on CaV1.2 channels and have 
different efficacy for treatment of hypertension and angina pectoris 
versus cardiac arrhythmias!~*-”. Consistent with photoaffinity-labelling 
and site-directed mutagenesis*’, dihydropyridines bind 
on the outer, lipid-facing surface of the pore module at the interface 
between two subunits of CaVAb, in analogy with their proposed site 
of action between domains III and IV of CaV1.2 channels*?!. Their 
binding site is exposed to the extracellular side of the membrane, 
but not to the intracellular side. These structural results reveal why 
charged dihydropyridines are ineffective when applied intracellularly’’, 
and they are consistent with location of the drug-binding site 
~11-14 Å from the outer surface of the lipid bilayer as inferred from 
studies of charged derivatives of amlodipine with hydrophobic linkers 
of increasing length*®. These comparisons reveal a close analogy 
between the site of dihydropyridine binding in our crystal structures 
of CaVAb and the expectations from studies of CaV1.2 channels, but 
the exact position of the drug-binding site in CaVAb is approximately 
one helical turn towards the extracellular side from the amino-acid 
residues implicated in dihydropyridine binding by studies of CaV1.2 
channels (Extended Data Fig. 7). This difference may reflect the great 
evolutionary distance between CaVAb and mammalian CaV channels 
and/or indirect allosteric effects of mutations studied in CaV1 
channels. 

Figure 4 | State-dependent inhibition by pore block with Br-verapamil 
and UK-59811. a, Br-verapamil. Ba²⁺ current records for CaVAb with 0 µM 
(black) and 10 µM (red) during the depolarizing pulse. b, State-dependent 
block of CaVAb (n = 7) and CaVAb T206S (n = 3) at 10 µM during trains 
of depolarizations at 1 Hz from −120 mV to 0 mV. The error bars for 
all the data points on this graph are too small to be visible. c, Inhibition 
by Br-verapamil for CaVAb and CaVAb T206S at V = −120 mV and 
following trains of depolarizations as in b. CaVAb: resting state block, 
blue circles, IC₅₀ = 24 ± 1.6 µM; state-dependent block, blue squares, 
IC₅₀ = 810 ± 80 nM. CaVAb T206S: resting state block, red circles, 
IC₅₀ = 115 ± 3.2 µM; state-dependent block, red squares, IC₅₀ = 24 ± 0.8 µM; 
n = 3-11; mean ± s.e.m. d, Side view of the pore module sectioned 
through the selectivity filter with Br-verapamil bound (yellow sticks). 
Ca²⁺, green spheres. e, Fo-Fc electron density (2.5σ, orange mesh) and 
anomalous scattering density (3σ, green mesh) for Br define the location of 
Br-verapamil. f, The two aromatic rings of verapamil are close to T206 of 
adjacent subunits. g, UK-59811 (red sticks) binds with its dihydropyridine 
ring deep in the central cavity. h, Anomalous scattering density (3.5σ, 
green mesh) of Br in UK-59811. i, S6 segments with residues surrounding 
UK-59811 in sticks. 


Binding of a single dihydropyridine to CaVAb induces a conformational 
change that alters the fourfold symmetry of the quaternary 
structure and induces changes in the three unoccupied dihydropyridine-binding 
sites that may prevent drug occupancy (Extended Data Fig. 8). 
Drug binding also disrupts the symmetry of the ion selectivity filter, 
allowing direct coordination of Ca²⁺ by carboxylate side chains. This 
conformational change is mediated in part by an altered pattern of 
hydrogen bonds formed by N181 in the subunit binding the dihydropyridine 
(Fig. 3). These structural results correlate closely with 
ligand-binding studies of CaV1.2 channels, which suggested that 
dihydropyridines induce high-affinity Ca²⁺ binding and block of 
the pore*”*°. Our structural studies reveal exactly how dihydropyridines 
act as indirect allosteric blockers of the pore of Ca²⁺ channels. 
Dihydropyridine binding to CaV1.2 channels is voltage-dependent 
because of the high affinity for the inactivated state!*-’. In a remarkable 
parallel, dihydropyridine binding causes a conformational change to an 
asymmetric pore structure in CaVAb, which is similar to the asymmetry 
induced in inactivated states of the parent NaVAb channel” and its relative 
NaVRh'®. Dihydropyridine binding may induce a similar asymmetric, 
Ca²⁺-blocked state of CaV1.2 channels and thereby enhance their 
inactivation, allowing selective inhibition in persistently depolarized 
cells. This mechanism underlies the use of dihydropyridines in treatment 
of hypertension and angina pectoris, in which vascular smooth 
muscle cells of resistance vessels are persistently depolarized, and their 
CaV1.2 channels are selectively inhibited by dihydropyridines. 

The phenylalkylamine receptor site was localized to the S6 segments 
in domains III and IV of CaV1.2 channels by photoaffinity labelling 
and mutational analysis, and it was proposed that the amino-acid 
side chains involved in drug binding point towards the lumen of the 
pore*®**5. Our structural results correlate precisely with this expectation 
and reveal the exact structure of the drug-receptor complex. 
Br-verapamil is stretched between two subunits of CaVAb, consistent 
with drug binding at the interface of domains III and IV in CaV1.2 
channels*°?*”°. As for dihydropyridines, phenylalkylamine binding 
at this site disrupts the fourfold symmetry of the pore (Extended 
Data Fig. 9). Location of the phenylalkylamine receptor site deep in 
the central cavity in the pore reveals why binding of these drugs is 
state-dependent. Access of phenylalkylamines to their receptor is 
greatly enhanced by opening the intracellular activation gate, which 
allows diffusion to the drug receptor site. Drug binding is therefore 
frequency-dependent, allowing selective block of CaV1.2 channels in 
rapidly firing cardiac myocytes”. This mechanism is the basis for use 
of verapamil for cardiac arrhythmias. 

Overall, our structural studies illuminate the complex pharmacology 
and therapeutic uses of Ca²⁺ antagonist drugs in treatment of different 
cardiovascular disorders at the atomic level (see Supplementary 
Discussion). These structural models will be important for design and 
development of next-generation Ca²⁺ antagonist drugs to provide safer 
and more effective treatment of hypertension, angina pectoris, cardiac 
arrhythmia, and other medical conditions. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 

METHODS 

CaVAb constructs and drugs. As originally defined, CaVAb was constructed by 
introducing the mutations E177D, S178D and M181D as a triple mutant into 
NaVAb?’. This construct was used for all electrophysiological studies, except as 
noted in figure legends. In this work, we have also used CaVAb E177D S178D 
M181N, which has an identical structure and Ca²⁺-binding properties and has high 
Ca²⁺ selectivity””. It gives greater consistency of high-resolution crystal structures. 
We have also added the mutation W195Y, which substitutes the Y residue from 
the analogous position in mammalian CaV1.1 channels for W195 in CaVAb. This 
mutant gives better resolution of drugs bound at the dihydropyridine site. CaVAb 
E177D S178D M181N W195Y was used for all structural studies presented here, 
except as noted in the figure legends. Similar structural results were obtained for 
both versions of CaVAb. We found that amlodipine and other dihydropyridines 
(Figs 1 and 2 and Extended Data Figs 1, 3 and 4) and verapamil and other phenylalkylamines 
(Fig. 4 and Extended Data Fig. 6) effectively blocked CaVAb and gave 
high-resolution crystal structures; however, we were unable to prepare crystals with 
bound diltiazem for structural biology so we have not addressed the structure of 
the benzothiazepine receptor site in this work. 

Electrophysiology. All measurements were done in insect cells (Trichoplusia ni 
cells; High5). All CaVAb constructs used were made on the background of the N49K 
mutation. Mutation N49K shifts the activation curve ~75 mV to more positive 
potentials compared to wild-type CaVAb and abolishes the use-dependent inactivation 
as described previously'**. All constructs showed good expression, allowing 
measurement of ionic currents 24-48 h after infection. Whole-cell Ba²⁺ currents 
were recorded using an Axopatch 200 amplifier (Molecular Devices) with glass 
micropipettes (2-4 MΩ). Capacitance was subtracted and 80-90% of series resistance 
was compensated using internal amplifier circuitry. Extracellular solution 
contained (in mM) 10 BaCl₂, 140 NMDG-methanesulphonate, 20 HEPES (pH 7.4, 
adjusted with Ba(OH)₂, [Ba²⁺]total = 13 mM). Intracellular solution contained 
(in mM) 105 CsF, 35 NaCl, 10 HEPES, 10 EGTA (pH 7.4, adjusted with CsOH). 
Current-voltage (I-V) relationships were recorded in response to steps to voltages 
ranging from −120 to +50 mV in 10-mV increments from a holding potential of 
−120 mV. Conductance-voltage (G-V) curves were calculated from the corresponding 
I-V curves. Pulses were generated and currents were recorded using 
Pulse software controlling an Instrutech ITC18 interface (HEKA). Data were analysed 
using Igor Pro 6.2 (WaveMetrics). Sample sizes were chosen to give s.e.m. 
values of less than 10% of peak values based on prior experimental experience. 
Inhibition curves were fit with a Hill equation with nH = 1.0 unless indicated otherwise 
in the figure legends. 
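
The Methods state that G-V curves were calculated from the I-V curves but do not give the formulas. One common way to do this, shown here only as an assumed sketch, is to convert each peak current to a chord conductance, G = I / (V − Vrev), and describe the normalized curve with a Boltzmann function characterized by a half-activation voltage and a slope factor k (the parameters reported in the Extended Data legends). The function names, the reversal potential argument and the assumption that k is in mV are all hypothetical and are not taken from the paper.

```python
import math


def chord_conductance(i_pA: float, v_mV: float, v_rev_mV: float) -> float:
    """Chord conductance G = I / (V - Vrev); v_rev_mV is an assumed input."""
    driving_force = v_mV - v_rev_mV
    if driving_force == 0:
        raise ValueError("conductance is undefined at the reversal potential")
    return i_pA / driving_force


def boltzmann_activation(v_mV: float, v_half_mV: float, k_mV: float) -> float:
    """Normalized conductance G/Gmax for a single Boltzmann activation curve."""
    if k_mV <= 0:
        raise ValueError("slope factor must be positive")
    return 1.0 / (1.0 + math.exp((v_half_mV - v_mV) / k_mV))


# Parameters reported for CaVAb in Extended Data Fig. 1b: V1/2 = -18.8, k = 3.68.
at_half = boltzmann_activation(-18.8, -18.8, 3.68)
at_zero = boltzmann_activation(0.0, -18.8, 3.68)
```

Under these assumptions the channel is half-activated at V1/2 by construction and almost fully activated at the 0 mV test potential used for the drug-block protocols.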

Protein expression and purification. The pFastBac-Flag-CaVAb was used as the 
construct for producing the homotetrameric model voltage-gated Ca²⁺ channel”. 
I199S, W195Y, and T206S constructs were generated via site-directed mutagenesis 
using QuikChange (Stratagene). Recombinant baculoviruses were produced 
using the Bac-to-Bac system (Invitrogen), and T. ni insect cells were infected for 
large-scale protein purification. Cells were harvested 72 h post-infection and resuspended 
in buffer A (50 mM Tris-HCl, pH = 8.0, 200 mM NaCl) supplemented 
with protease inhibitors and DNase. After sonication, digitonin (EMD Biosciences) 
was added to 1%, and solubilization was carried out for 1-2 h at 4 °C. Clarified 
supernatant was then incubated with anti-Flag M2-agarose resin (Sigma) for 1-2 h 
at 4 °C with gentle mixing. Flag-resin was washed with ten column volumes of 
buffer B (buffer A supplemented with 0.12% digitonin) and eluted with buffer B 
supplemented with 0.1 mg ml⁻¹ Flag peptide. The eluant was concentrated and then 
passed over a Superdex 200 column (GE Healthcare) in 10 mM Tris-HCl pH = 8.0, 
100 mM NaCl and 0.12% digitonin. The peak fractions were concentrated using a 
Vivaspin 30K centrifugal device. 

Crystallization and data collection. CaVAb and the W195Y mutant were concentrated 
to ~20 mg ml⁻¹ and reconstituted into DMPC:CHAPSO (Anatrace) bicelles 
according to standard protocols'*!”**3. The protein-bicelle preparation and a well 
solution containing 1.8-2.0 M ammonium sulphate, 100 mM Na-citrate pH = 5.0 
were mixed in a 1:1 ratio and set up in a hanging-drop vapour-diffusion format. 
All the antagonist complex crystals were obtained through co-crystallization by 
incubating the protein-bicelle with 100 µM antagonist overnight before setting 
up crystallization trials. For the UK-59811 complex, both 100 µM and 200 µM 
antagonist were used for UK-59811/CaVAb crystals. Crystals were cryoprotected 
by soaking in 0.1 M Na-acetate, pH 5.0, 26% glucose, 2.0 M ammonium sulphate, 
and 5 mM Ca²⁺. Crystals were plunged into liquid nitrogen and maintained at 
100 K during all data collection procedures. 

The anomalous diffraction data sets for Br were collected at 0.9194 Å, and the 
anomalous data sets for Ca²⁺ were collected at 1.75 Å with the same synchrotron 
radiation source (Advanced Light Source, BL8.2.1). To optimize the anomalous 
scattering signal, the data sets were collected by using the ‘inverse beam strategy’ 
with the wedge size of 5°. 
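
For orientation, the two collection wavelengths can be converted to photon energies with E = hc/λ. The sketch below is illustrative only; the function name is hypothetical, and the comparison with the bromine K absorption edge (about 13.47 keV) relies on a general tabulated value rather than on anything stated in the paper.

```python
HC_KEV_ANGSTROM = 12.398419843  # Planck constant times speed of light, in keV·Å


def photon_energy_keV(wavelength_angstrom: float) -> float:
    """Convert an X-ray wavelength in Å to photon energy in keV (E = hc / λ)."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


br_energy = photon_energy_keV(0.9194)  # data sets collected for Br
ca_energy = photon_energy_keV(1.75)    # data sets collected for Ca2+
```

The Br wavelength corresponds to about 13.49 keV, just above the tabulated Br K edge, which is where an anomalous signal from bromine would be expected to be strong. The 1.75 Å wavelength corresponds to about 7.08 keV.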
Structure determination, refinement, and analysis. X-ray diffraction data were 
integrated and scaled with the HKL2000 package™ and further processed with 
the CCP4 package**. The structures of CaVAb and its antagonist complexes were 
solved by molecular replacement using an individual subunit of the CaVAb structure 
(PDB code 4MS2) as the search template. The data sets were processed in 
the P2₁22₁ space group, in which there are four molecules in one asymmetric unit. 
Crystallography and NMR System software** was used for refinement of coordinates 
and B-factors. Final models were obtained after several cycles of refinement 
with REFMAC* and PHENIX* plus manual re-building using COOT*’. The 
geometries of the final structural models of CaVAb and its antagonist complexes 
were verified using PROCHECK™. Divalent cations were identified by anomalous 
difference Fourier maps calculated using data collected at a wavelength of 
1.75 Å for Ca²⁺. The Br atoms of UK-59811 and Br-verapamil were identified by 
anomalous difference Fourier maps calculated using data collected at a wavelength 
of 0.9194 Å. Procedures accounting for merohedral twinning were performed 
during structural refinement of amlodipine, nimodipine, and Br-verapamil data 
sets. Detailed crystallographic data and refinement statistics for all constructs 
are shown in Extended Data Table 1. All structural figures were prepared with 
PyMol"'. 

Extended Data Figure 1 | Biophysical characterization of CaVAb I199S. 
a, Ba²⁺ currents recorded from a holding potential of −120 mV to test 
potentials from −60 mV to 20 mV in 10 mV steps for I199S. b, G-V curves of 
CaVAb and CaVAb I199S derived from peak I-V relationships. The voltages 
for half-maximal activation and slopes are: CaVAb: V1/2 = −18.8 ± 0.3, 
k = 3.68 ± 0.43, n = 7; CaVAb I199S: V1/2 = −18.8 ± 0.3, k = 3.88 ± 0.47 (n = 5). 
c, Repetitive depolarization to 0 mV at 1 Hz from a holding potential of 
−120 mV (n = 5). d, Steady-state inactivation of CaVAb and CaVAb I199S. 
Two pulses were applied: a 300-ms conditioning pulse to the indicated 
potentials followed by 50-ms test pulse to 0 mV (n = 3). e, State-dependent 
block of CaVAb I199S by 10 nM (green), 100 nM (blue), or 1.5 µM (red) 
amlodipine during repetitive depolarizations to 0 mV (left, n = 3-5 cells). Ba²⁺ 
currents in 100 nM amlodipine for CaVAb I199S (right). f, Concentration-dependent 
block of CaVAb I199S by nimodipine at 100 nM (blue), 1 µM (red), 
5 µM (brown), 10 µM (grey) and 50 µM (black) (left, n = 4-5 cells for each 
curve). Ba²⁺ currents in the presence of 5 µM nimodipine for CaVAb I199S 
(right). 

Extended Data Figure 2 | Structural comparison of the binding modes 
of amlodipine, nimodipine, and UK-59811. a, Superposition of CaVAb in 
complexes with amlodipine (cyan), nimodipine (yellow), and UK-59811 
(magenta) at the dihydropyridine binding site viewed from the side of the 
pore module. The side chains of dihydropyridine-interacting residues are 
shown in sticks. b, An Fo-Fc simulated annealing omit map contoured 
at 2.5σ for nimodipine. c, An Fo-Fc simulated annealing omit map 
contoured at 2.5σ for UK-59811. d, An Fo-Fc simulated annealing omit 
map contoured at 2.5σ for DMPC. 

Extended Data Figure 3 | Biophysical characterization and drug block 
of CaVAb W195 and CaVAb Y195. a, Sequence alignment of CaVAb S6 
segment and CaV1.1 DIV S6. W195 in CaVAb is equivalent to Y1358 in 
CaV1.1. b, Ba²⁺ currents recorded from a holding potential of −120 mV to 
test potentials from −60 mV to 20 mV in 10 mV steps for CaVAb W195Y. 
c, G-V curves for CaVAb W195 and CaVAb Y195 derived from peak 
I-V relationships. The voltages for half-maximal activation and slopes 
are: CaVAb W195, V1/2 = −18.8 ± 0.3, k = 3.7 ± 0.43, n = 7; CaVAb Y195, 
V1/2 = −9 ± 0.3, k = 7.44 ± 0.1, n = 5. d, Steady-state inactivation 
of CaVAb W195 and CaVAb Y195 (n = 3). Two pulses were applied: 
a 300-ms conditioning pulse followed by 50-ms test pulse to 0 mV. 
e, State-dependent block of CaVAb W195Y by nimodipine at 100 nM 
(white), 500 nM (blue), 1 µM (green), 5 µM (red), and control (grey). 
f, Concentration-dependent block of CaVAb W195Y by nimodipine. 
IC₅₀ = 508 ± 93 nM (n = 4-5 cells for each point). 


© 2016 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 




[Extended Data Figure 4 image. Panel a: Ba²⁺ current traces for control, 100 nM, 500 nM and 5,000 nM UK-59811. Panels b and c: plots against number of pulses (0-20) and against UK-59811 concentration (nM).] 

Extended Data Figure 4 | Biophysical characterization of block by and 51M (brown). For each curve, n = 4-5 cells. c, Concentration- 
UK-59811. a, Ba** currents for state-dependent block by different response curve for UK-59811. Data were fit with a Hill equation assuming 
concentrations of UK-59811. b, State-dependent block of CayAb by a 1:1 binding. ICs) = 194+22nM, n=4-5. 
UK-59811 at 0nM (black), 100 nM (green), 500 nM (red), 11M (blue), 
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Extended Data Figure 5 | Evidence for the partially dehydrated annealing omit map contoured at 3o for residues 178 and 181 for CayAb- 
Ca?* binding and carboxyl-carboxylate pairs at the selectivity filter UK-59811. e, Top view of Site 1 with the anomalous difference Fourier 
entryway. a, Top view of an Fo-Fc simulated annealing omit map map density (red mesh, contoured at 3c) calculated with diffraction data 


contoured at 30 for residues 178 and 181 for the wild-type channel without _of crystals collected at 1.75 A wavelength. Ca?* is shown as a green sphere. 
drug. b, Top view of an Fo-Fc simulated annealing omit map contoured at _Site 1 residues are shown in sticks. Hydrogen bonds are indicated with 

30 for residues 178 and 181 for CayAb-amlodipine. c, Top view of an dashed lines. f, Top view of Site 2 with the anomalous difference Fourier 
Fo-Fc simulated annealing omit map contoured at 2.50 for residues map density (magenta mesh, contoured at 30). 

178 and 181 for CayAb-nimodipine. d, Top view of an Fo-Fc simulated 
Extended Data Figure 6 | Biophysical characterization of verapamil
block of CavAb and functional properties of CavAb T206S. a, Chemical
structure of verapamil. b, Concentration dependence of verapamil
inhibition of CavAb. The amplitude of the peak Ba²⁺ current was recorded
after applying 20 pulses at a frequency of 1 Hz, where the block reaches
steady state. The data were fit by a Hill equation assuming a 1:1 binding
ratio. n = 4-7 cells. IC50 = 475 ± 25 nM. c, Ba²⁺ currents of CavAb T206S.
d, G-V curves. CavAb (black): V1/2 = 18.8 ± 0.3 mV, k = 3.7 ± 0.43 (n = 5);
CavAb T206S (blue): V1/2 = −15 ± 1.8 mV, k = 6.6 ± 0.4 (n = 5). e, Current
traces of CavAb (black) and CavAb T206S (blue) during a 1-s depolarizing
pulse from a holding potential of −120 mV to −10 mV. f, State-dependent
inhibition of CavAb T206S by Br-verapamil at 10 μM (black), 25 μM
(green), 50 μM (red), and 100 μM (blue). For each curve, n = 4-5 cells.
Extended Data Figure 7 | Comparison of dihydropyridine binding
site in CavAb and Cav1.2. The pore domain of CavAb is illustrated with
two subunits in view, one in tan corresponding to domain III of Cav1.2
and one in blue corresponding to domain IV of Cav1.2. The amino
acid residues in CavAb corresponding to those that are important for
dihydropyridine binding to Cav1.2 channels are highlighted in red. Bound
amlodipine is illustrated with green sticks.
Extended Data Figure 8 | Amlodipine binding breaks symmetry.
a, The overall structure of CavAb in complex with amlodipine (shown
in ribbon representation). Measuring the Cα distances of V196 (near
the amlodipine binding pocket) from the 4 subunits shows the channel is
asymmetrical. b, Binding of amlodipine (sticks in red) induces asymmetry
and causes rearrangement of the lipid in the central cavity. c-f, The
amlodipine binding pocket showing the Cα-Cα distance at two layers
(Y195-G164 and I199-F167) horizontally. At layer 1 (Y195-G164), the
Cα-Cα distance of its neighbouring sites (11.0 Å in d and 11.0 Å in f)
matches the drug binding site (10.9 Å in c), but the diagonal site (e) is too
narrow (10.6 Å). At layer 2 (I199-F167), the pocket width of the diagonal
site (11.1 Å in e) matches the drug-binding site (11.0 Å in c), but the two
neighbouring sites are too wide (11.4 Å in d and 11.3 Å in f).
Colour key: green, chain A; cyan, chain B; purple, chain C; yellow, chain D.


Extended Data Figure 9 | Br-verapamil binding breaks symmetry. a, Alignment of the 4 subunits of CavAb in complex with Br-verapamil showing that the
voltage sensor module (VSD) and the ends of S6 are different. b, Measuring the Cα distances between T206 residues in adjacent subunits shows that the
channel is indeed asymmetrical with Br-verapamil in the pore.


Extended Data Table 1 | Data collection, phasing and refinement statistics


CayAb Ca,Ab" CayAb” Ca,Ab CayAb" Ca,Ab" 
(W195Y) (W195Y) (W195Y) 
UK-59811 UK-59811 Amlodipine Nimodipine Br-verapamil 
5 mM Ca²⁺ 5 mM Ca²⁺ 5 mM Ca²⁺ 5 mM Ca²⁺ 5 mM Ca²⁺ 5 mM Ca²⁺
Data collection
Space group P21221 P21221 P21221 P21221 P21221 P21221
Cell dimensions
a, b, c (Å) 124.9, 125.7, 191.5; 125.9, 126.0, 192.1; 125.5, 125.9, 191.7; 125.6, 125.3, 191.7; 125.3, 125.4, 191.6; 125.6, 125.6, 192
α, β, γ (°) 90, 90, 90; 90, 90, 90; 90, 90, 90; 90, 90, 90; 90, 90, 90; 90, 90, 90
Resolution (Å) 2.7 3.3 3.3 3.2 3.2 3.3
Rsym or Rmerge 11.4(98.4) 12.6(74.2) 11.7(60.1) 12.1(49.1) 11.2(68.6) 18.8(86.1)
CC1/2 (%) 99.8(87.7) 99.8(84.3) 99.7(87.4) 99.4(89.2) 99.7(86.7) 98.4(70.3)
I/σI 13.4(2.4) 13.4(3.3) 13.1(3.1) 10.2(2.8) 15.2(3.5) 6.0(1.7)
Completeness (%) 92.5(97.8) 95.0(100.0) 92.1(82.0) 92.7(94.5) 99.8(99.8) 97.7(98.8)
Redundancy 10.1(9.8) 9.5(9.9) 9.4(9.2) 5.1(5.2) 9.3(9.0) 4.9(5.0)
Refinement
Resolution (Å) 30-2.7 30-3.3 30-3.3 30-3.2 30-3.2 30-3.2
No. reflections 76513 46606 42515 46657 50390 49327
Rwork / Rfree 22.1/26.2 28.0/30.5 27.5/30.0 23.3/27.7 21.7/25.6 25.1/29.4
No. atoms 9684 7400 7403 7380 7393 7366
Protein 8780 7192 7200 7192 7192 7200
Ligand/ion 887 205 199 187 189 166
Water 17 3 2 1 28
B-factors
Protein 76.9 104.4 111.4 100.9 108.0 114.1
Ligand/ion 74.2 99.9 95.8 85.7 86.3 100.6
Water 46.6 57.9 63.2 48.8 52.8
R.m.s. deviations
Bond lengths (Å) 0.009 0.013 0.013 0.013 0.013 0.013
Bond angles (°) 1.15 1.74 1.75 1.73 1.54 1.74
Ramachandran statistics
Favored 96% 96% 96% 95.0% 96% 96.0%
Allowed 4.1% 3.6% 3.6% 4.6% 4.0% 4.1%
Outliers 0.28% 0.23% 0.23% 0.23% 0.12% 0.12%


1 This data set was collected at 0.9198 Å.
2 This data set was collected at 1.75 Å.
All other data sets were collected at 1.0 Å.
CORRECTIONS & AMENDMENTS


CORRIGENDUM 
doi:10.1038/nature18280 


Corrigendum: A novel multiple- 
stage antimalarial agent that 
inhibits protein synthesis 


Beatriz Baragaña, Irene Hallyburton, Marcus C. S. Lee,
Neil R. Norcross, Raffaella Grimaldi, Thomas D. Otto,
William R. Proto, Andrew M. Blagborough, Stephan Meister,
Grennady Wirjanata, Andrea Ruecker, Leanna M. Upton,
Tara S. Abraham, Mariana J. Almeida, Anupam Pradhan,
Achim Porzelle, Maria Santos Martinez, Judith M. Bolscher,
Andrew Woodland, Torsten Luksch, Suzanne Norval,
Fabio Zuccotto, John Thomas, Frederick Simeons,
Laste Stojanovski, Maria Osuna-Cabello, Paddy M. Brock,
Tom S. Churcher, Katarzyna A. Sala, Sara E. Zakutansky,
Maria Belén Jiménez-Diaz, Laura Maria Sanz, Jennifer Riley,
Rajshekhar Basak, Michael Campbell, Vicky M. Avery,
Robert W. Sauerwein, Koen J. Dechering, Rintis Noviyanti,
Brice Campo, Julie A. Frearson, Iñigo Angulo-Barturen,
Santiago Ferrer-Bazaga, Francisco Javier Gamo, Paul G. Wyatt,
Didier Leroy, Peter Siegl, Michael J. Delves, Dennis E. Kyle,
Sergio Wittlin, Jutta Marfurt, Ric N. Price, Robert E. Sinden,
Elizabeth A. Winzeler, Susan A. Charman, Lidiya Bebrevska,
David W. Gray, Simon Campbell, Alan H. Fairlamb,
Paul A. Willis, Julian C. Rayner, David A. Fidock,
Kevin D. Read & Ian H. Gilbert


Nature 522, 315-320 (2015); doi:10.1038/nature14451 


In this Article, Torsten Luksch was inadvertently omitted from the 
author list. He is affiliated with the Drug Discovery Unit, Division of 
Biological Chemistry and Drug Discovery, College of Life Sciences, 
University of Dundee, Dundee DD1 5EH, UK. His contribution was 
the analysis of initial screening data alongside author A.W. The online 
versions of the paper have been corrected. 


CORRIGENDUM 
doi:10.1038/nature18623 


Corrigendum: Robust neuronal 
dynamics in premotor cortex 
during motor planning 


Nuo Li, Kayvon Daie, Karel Svoboda & Shaul Druckmann 


Nature 532, 459-464 (2016); doi:10.1038/nature17643


We would like to correct several minor errors in this Article. In the Fig. 2b
legend, ‘*P < 0.01’ should have read ‘**P < 0.01’. In the Methods
‘Photoinhibition’ section, the description of galvo step time during
photoinhibition of multiple cortical locations should have read ‘step
time: <0.2 ms; dwell time: >4.8 ms’ instead of ‘step time: <4.8 ms’.
In Extended Data Fig. 4c, the y-axis values of the three bottom panels
should have run from −1 to 1, instead of from 0 to 2. In the Extended
Data Fig. 5a legend, the citation given for Tlx_PL56-Cre mice should
have been to ref. 46, rather than ref. 50. In the Extended Data Fig. 6b
legend, the sessions were incorrectly referred to as ‘lick-right trials
(session 1, 4)’ and ‘control lick-right trajectories (session 2, 3, 5)’
instead of ‘lick-right trials (session 1, 3, 4)’ and ‘lick-right trajectories
(session 2, 5)’. All of these errors have been corrected online, and none
of them affects the description, interpretation or conclusions of the
Article.


122 | NATURE | VOL 537 | 1 SEPTEMBER 2016 


CORRIGENDUM 
doi:10.1038/nature18937 


Corrigendum: Convection in a 
volatile nitrogen-ice-rich layer 
drives Pluto’s geological vigour 


William B. McKinnon, Francis Nimmo, Teresa Wong, 

Paul M. Schenk, Oliver L. White, J. H. Roberts, J. M. Moore, 

J. R. Spencer, A. D. Howard, O. M. Umurhan, S. A. Stern, 

H. A. Weaver, C. B. Olkin, L. A. Young, K. E. Smith & the New 
Horizons Geology, Geophysics and Imaging Theme Team 


Nature 534, 82-85 (2016); doi:10.1038/nature18289 


In the list of the New Horizons Geology, Geophysics and Imaging 
Theme Team, two members were inadvertently omitted: Richard P. 
Binzel and Alissa Earle (both affiliated with Massachusetts Institute of 
Technology, Cambridge, Massachusetts 02139, USA). In addition, the 
vertical scale in Fig. 3 should be read in metres, not kilometres. These 
errors have been corrected in the online versions of the Letter. 
THE PROJECT TWINS


TOOLBOX 


COMPUTERS 
ON THE REEF 


Tools that analyse underwater images of the world’s 
coral reefs are transforming marine ecology. 


BY JEFF TOLLEFSON 


Blue spires seem to pop out of the
image in one place; a patch of
bushy forms of pinkish-purple in
another. To the untrained eye, each is distinct 
and clearly a coral. Then, Manuel Gonzalez- 
Rivero points to a third cluster. The bulbous 
shapes look like coral, except the smooth 
grey surface isn't quite right. “The texture 


and colour suggest that you are probably not 
looking at a coral,” he says. “More likely you 
are looking at a crustose coralline alga.”

The image is a high-resolution panoramic 
photograph collected by the XL Catlin Seaview 
Survey, a scientific initiative that began to cata- 
logue the world’s reefs in 2012. To understand 
how coral reefs are responding to overfishing, 
pollution, global warming and ocean 
acidification, the Catlin team — ecologist 


Gonzalez-Rivero among them — is document- 
ing coral abundance, health, structure and bio- 
diversity in millions of underwater snapshots. 

It would take decades to go through all 
these images manually, even with Gonzalez- 
Rivero’s expert eyes. But the Catlin team is 
using a neural-networking algorithm: a deep- 
learning system in which a computer learns 
to classify what it sees in coral-reef pictures. 
The project was led by computer scientist
Oscar Beijbom at the University of California,
Berkeley, and the software can zip through
Catlin’s gigantic photo album (currently
around one million photographs) in a matter
of months.

The software is just one example of how 
coral researchers are embracing advances in 
computer science and software to speed up 
under-sea mapping of reefs around the world. 
Combined with high-quality imagery and sen- 
sors that collect standardized biological data 
about reefs, these tools could unleash an era 
of semi-automated data collection and moni- 
toring, freeing up ecologists to spend less time 
processing data and more time doing research. 

“It’s a tremendous step forward,” says Mark
Eakin, who manages the Coral Reef Watch 
programme for the US National Oceanic 
and Atmospheric Administration (NOAA) 
in College Park, Maryland. “When you aren't 
limited by the speed of people going through 
and manually processing images, the yield of 
information is just so much greater.” 


OCEAN OF DATA 

Coral researchers’ entry into the world of big 
data comes none too soon. Long limited by the 
size of their diving fins and the capacity of their 
oxygen tanks, marine ecologists are racing to 
expand their surveys to document and under- 
stand the longer-term impacts of rising ocean 
temperatures and acidification. The bleaching 
of corals around the world that has accompanied
the epic 2015-16 El Niño warming event
in the tropical Pacific Ocean has only height- 
ened concerns. 

Gonzalez-Rivero’s goal is to cover as much 
territory as possible to get a sense of how dif- 
ferent corals and reefs are responding to these 
stresses. Computers will never replace the 
human eye, nor will they obviate the need for 
detailed underwater investigations and labora- 
tory research, but they can speed up the basic 
surveys, he says. “What we are trying to do is 
find a compromise where we get enough information
to understand the reef, but at a much
faster pace and in a much cheaper way.”

The quality does not have to be compro- 
mised: according to Beijbom’s unpublished 
results, the deep-learning system agrees with 
the human eye on features in coral photos 
about 81% of the time — impressive consid- 
ering that even two experts are likely to agree 
only 84% of the time. 
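
How an agreement figure of this kind is computed can be shown with a short sketch. It assumes simple percent agreement over the same set of annotated points; the article does not say which measure Beijbom used, and the label lists below are invented purely for illustration.

```python
from typing import Sequence


def percent_agreement(labels_a: Sequence[str], labels_b: Sequence[str]) -> float:
    """Percentage of annotated points on which two annotators give the same label."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both annotators must label the same points")
    if not labels_a:
        raise ValueError("at least one labelled point is required")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return 100.0 * matches / len(labels_a)


# Hypothetical labels for five points in one photograph.
expert = ["coral", "coral", "algae", "sand", "coral"]
machine = ["coral", "algae", "algae", "sand", "coral"]

if __name__ == "__main__":
    print(f"{percent_agreement(expert, machine):.0f}% agreement")
```

The same function can compare two human experts, which is how a machine-versus-expert score can be judged against an expert-versus-expert baseline.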

Beijbom plans to launch the algorithm in a 
few months’ time for anyone submitting pic- 
tures to his website CoralNet, which already 
uses computer-assisted systems to help the 
automated analysis of images. The service 
is free thanks to funding from NOAA, and 
420 users from a variety of institutions, includ- 
ing NOAA, have already uploaded nearly 
269,000 images to the site. The best results 
seem to come from use of a semi-automated 
program in which the computer does simple 
analyses and alerts human experts to cases that 


it’s not confident about, Beijbom says. 
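
The semi-automated arrangement described here can be sketched as a confidence threshold: predictions the classifier is sure about are accepted, and the rest are queued for an expert. This is a minimal illustration, not CoralNet's implementation; the `Prediction` type, the 0.8 threshold and the sample values are all assumptions.

```python
from typing import List, NamedTuple, Tuple


class Prediction(NamedTuple):
    image_id: str      # hypothetical identifier for an uploaded image
    label: str         # class proposed by the classifier
    confidence: float  # classifier's confidence in the label, 0.0 to 1.0


def triage(
    predictions: List[Prediction], threshold: float = 0.8
) -> Tuple[List[Prediction], List[Prediction]]:
    """Split predictions into auto-accepted ones and ones needing expert review."""
    accepted: List[Prediction] = []
    needs_review: List[Prediction] = []
    for pred in predictions:
        if pred.confidence >= threshold:
            accepted.append(pred)
        else:
            needs_review.append(pred)
    return accepted, needs_review


# Invented sample predictions.
sample = [
    Prediction("img-001", "coral", 0.97),
    Prediction("img-002", "crustose coralline alga", 0.55),
    Prediction("img-003", "sand", 0.80),
]

if __name__ == "__main__":
    accepted, needs_review = triage(sample)
    print("accepted:", [p.image_id for p in accepted])
    print("for expert review:", [p.image_id for p in needs_review])
```

Raising the threshold sends more images to human experts; lowering it trades expert time for a higher risk of unreviewed errors.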

In many ways, Gonzalez-Rivero says, marine 
science is catching up with the terrestrial sci- 
ences, which have been developing tools to 
gather and process copious amounts of data 
from satellites and aircraft for decades. The 
software and hardware can't be directly translated
to analysing seas, however: the ocean
swallows light, so it is difficult to study anything
but the shallowest reefs from above.
That has pushed coral researchers
to adapt the tools. At Michigan State
University in East Lansing, for example,
biophysicists David Kramer and Atsuko
Kanazawa have modified a handheld sensor 
originally designed for agricultural research. 

When used on land, the sensor measures 
information such as fluorescence in plants, 
the carbon content of the soil, the temperature 
of the air and the humidity. Around 300 sen- 
sors are in use in 18 countries, and every time 
a researcher or a government official takes 
a reading, the data are uploaded to a central 
server for analysis. 

The modified system, dubbed CoralspeQ, 
pings reefs with different kinds of light and 
records the returning spectral signal in 
256 wavelengths, from ultraviolet to infrared. 
These data can be used to measure a reef’s pho- 
tosynthetic activity, for instance, by measuring 
the fluorescence of chlorophyll in symbiotic 
algae that provide their host corals with oxygen 
and nutrients. Knowing how much photosyn- 
thetic activity is taking place, and where, could 
help researchers to identify stressed systems, 
Kramer says. 

The devices use commercially available 
sensors and are built with the help of 3D print- 
ers. Kramer and Kanazawa hope to bring down 
the cost of the underwater version from its 
current US$500 and get it into the hands of as 
many scientists as possible. “We need an army 
of people making high-quality measurements,” 
Kramer says. 




COMPUTER-ASSISTED VISION 

Marine microbiologist Arjun Chennu has 
developed an underwater imaging system to 
collect even more detailed data across a greater 
radiation spectrum. Coral ecologists then 
annotate the images, and the information is fed 
into a neural-network algorithm that is based 
on open-source machine-learning software 
and is similar to the one developed by Beijbom. 
The machine's ‘hyperspectrum’ means that it 
can capture much more information than can 
the human eye, says Chennu, who works at 
the Max Planck Institute for Marine Micro- 
biology in Bremen, Germany. This makes it 
easier to differentiate between corals that look 
similar in standard images. “For example, we 
resolve the often-used ‘other coral’ categories 
into their proper taxonomic types, and also
include sponges, macroalgae and seagrass in 
our predictions,” he says. 

Others have adapted commercially available 
software that is already used to map land- 
scapes and analyse landslides by overlapping 
2D images into 3D models. PhD student John 
Burns at the University of Hawaii at Manoa’s 
Institute of Marine Biology uses a program 
called Agisoft PhotoScan, which costs $549 
for an educational licence for the professional 
edition. Free software is available, but it is less 
sophisticated, Burns says. 

The models — which can achieve a resolution 
of just 1 millimetre when used with good cam- 
eras — can be analysed by people or computers 
to identify coral species and quantify reef cov- 
erage. But, because they’re 3D, they can also be 
used to track structural changes as reefs bleach 
and break down owing to high ocean temperatures:
a new kind of ecological information.

For Burns, the beauty of the method is its 
simplicity: data can be collected quickly and 
with minimal training. “This method just lets 
you take hundreds of thousands of single-lens 
images with your camera, and then you are 
essentially stitching them together,” he says. 


STANDARDIZED STORE 

Sophisticated technologies aren't the only 
answer, says Emily Darling, a marine ecolo- 
gist with the Wildlife Conservation Society 
in New York City. Because separate research 
efforts are collecting ever-greater quantities 
of data on coral reefs, it is important that they 
collect standardized data sets and store them in 
a repository that can be accessed by the entire 
community. 

In an effort to collect systematic data on 
the recent global bleaching event, for example,
Darling and her colleagues came up with
a very simple technology: an Excel spreadsheet
that scientists around the world can use
to register various data on reef conditions. The 
value is that when scientists come out of the 
water, they can immediately import and ana- 
lyse their data, and Darling now has uniform 
results from more than 61,000 reef colonies in 
13 countries. Roughly 58% showed bleaching. 
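
The value of a shared template is that records from different teams can be pooled and summarized with no per-team cleanup. The sketch below assumes a hypothetical column layout (`site`, `country`, `colony_id`, `bleached`); Darling's actual spreadsheet columns are not given in the article, and the rows are invented.

```python
import csv
import io

# Invented records in a hypothetical standardized layout.
SAMPLE_CSV = """site,country,colony_id,bleached
Reef A,Country X,1,yes
Reef A,Country X,2,no
Reef B,Country Y,1,yes
"""


def bleaching_rate(csv_text: str) -> float:
    """Percentage of colony records marked as bleached."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("no colony records found")
    bleached = sum(row["bleached"].strip().lower() == "yes" for row in rows)
    return 100.0 * bleached / len(rows)


if __name__ == "__main__":
    print(f"{bleaching_rate(SAMPLE_CSV):.1f}% of colonies bleached")
```

Because every contributor uses the same column names and value conventions, the same few lines work on one team's dive or on the pooled global data set.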

Ultimately, Darling says, coral ecologists 
need to converge on some kind of a central 
repository for the full suite of information 
that they are collecting around the world. “We 
need places where data are accessible, where 
they are telling stories, and where people can 
go and figure out whether conservation actions 
are working or not,” she says. “We need to be 
able to answer those questions a lot faster.”


CORRECTION 

The story ‘The paper promoters’ (Nature 
536, 113-114; 2016) should have made 
clear that Altmetric.com collects data from 
both mainstream and social media. 
ART-SCIENCE COLLABORATIONS
Joe Gerhardt, one-half of the UK artist duo Semiconductor, explores the archives at CERN with archivist Anita Hollier as part of the COLLIDE initiative.


Change of perspective 


Pick up a lump of clay or stare at a Leonardo water drawing — your science, not just your 
frame of mind, will benefit from it. 


BY SHEILA MULROONEY ELDRED 


After earning her PhD in Earth and
planetary sciences, Johanna Kieniewicz
found herself in a coveted tenure-track
job. But as she dug more deeply into her work, 
she felt her field of vision narrowing — and not 
in a good way. Extreme focus left her worried 
that she was stifling her creative side. 

“With the intensity of those sorts of jobs, it
becomes all that you do,” she says. “I was in danger
of losing the bigger picture.” To re-engage
with her artistic side — she had always had a 
penchant for drawing and making things with 
her hands — she took a leave of absence and 


went to art school. There, she came to realize 
how skills taught in the art world could influ- 
ence science. Asking difficult questions about 
purpose and ethics, or imagining both fantas- 
tic and terrifying futures, helps scientists to put 
their work in perspective, she says. She used 
her art experience to nab a dream job as head 
of outreach and engagement at the Institute of 
Physics in London, where she coordinates with 
art museums and theatres to pull the public into 
conversations about science. “Ultimately, both 
artists and scientists are asking big questions 
about the world,” Kieniewicz says. “A lot of rich
and exciting stuff is happening between them.”

Although Kieniewicz took her affinity for 


art to the far end of the spectrum, attending art 
school is hardly a prerequisite for those who 
hope to expand their scientific horizons and 
frame an experiment differently or get past a 
sticking point. Even a rudimentary interest in 
art can help to shift a researcher's perspective. 
Routes into the realm include creating your own 
art, collaborating with artists and viewing art 
that resonates with you. 

Making art can be very helpful for scientists 
when they are failing to make progress. “Sometimes
you have to dive in deeply, but sometimes
you're stuck and have to get unstuck,” says
Robbert Dijkgraaf, director of the Institute for
Advanced Study in Princeton, New Jersey.
He advises his students to engage in some
form of art when they encounter seemingly 
insurmountable obstacles in their research. 

Cancer researcher Silvia Balbo relates to 
that recommendation. She has access to an art 
studio for precisely that purpose. It’s been her 
escape ever since she took a sculpting class in
high school in Turin, Italy. “Whenever I feel 
like things are stuck, I go back to it,” she says. 
She made use of the studio many times during a 
particularly gruelling three-year project on how 
tobacco smoke and alcohol damage DNA and 
contribute to cancer. 

Balbo would often head, exhausted, directly 
from her lab at the University of Minnesota in 
Minneapolis to the studio. “I'd be super tired,
but I'd get there and then suddenly feel super 
energized,” she says. “On those days where you 
feel like you haven't accomplished anything, it’s 
nice to get a feel for making something. I picked 
clay because it’s constructive: all of a sudden, I 
have a piece. That immediate outcome is very 
rewarding.”

Clay modelling also gives her a chance to turn 
off the structured, analytical part of her brain, 
she says, and allow intuition and creativity to 
take over. Often, she leaves the studio with a 
fresh outlook on a knotty experiment. “I'll get
out of there and realize, ‘Oh, I had not thought
of it in this way before,’” she says.

Over the course of that project, Balbo 
sculpted, fired and glazed four pieces: nude 
women in various languid postures drenched 
in streaks of blue glaze that she now displays 
in her home. Ultimately, her team had a break- 
through, and published the findings. In addition 
to unlocking new ideas in the lab, she credits the 
sculpting with helping her to stay on track. “It’s 
very energizing to have a peek into the art world
and recharge your batteries,” she says.


EYE OF THE BEHOLDER 
The pay-offs of art involvement need not come 
just from creation. Simply looking at it can 
also bring benefits: gazing at other people's 
creative endeavours can help scientists to find 
inspiration and come up with new approaches. 
Chemist Catherine Murphy at the University 
of Illinois at Urbana-Champaign is drawn to 
close-ups of natural objects, the bright col- 
ours of inorganic compounds and the brilliant 
hues of gold nanomaterials. She has a copy of 
artist Georgia O’Keeffe’s Red Poppy No. VI on 
her office wall just so that she can stare at the 
flower’s vibrant scarlet petals. She once bought 
a painting at an art fair that looked to her like 
proteins seen through an atomic force micro- 
scope (not what the artist had in mind, she says). 
“I thought it was really interesting that the same
visual could be perceived in so many different
ways,” she says. “In science, the more different
perspectives you have on the phenomena
you're studying, the richer the understanding
becomes.”

Other ways to stretch scientific thinking are 
discovered when researchers collaborate with 


RESOURCES 


Up your art quotient 


Scientists who have no experience in art can 
still find ways to engage with their creative 
side. “If you have any curiosity for art, I’d 
encourage you to give it a try,” says Silvia
Balbo, a cancer researcher at the University
of Minnesota in Minneapolis.

• Make a friend in the art department of
your institution. Go to a thesis presentation
and invite an art student to visit your lab.

• Apply for a residency or offer to
participate in one. Here are a few: the
Massachusetts Institute of Technology
Center for Art, Science & Technology in
Cambridge (go.nature.com/2bnppjh),
Arts@CERN in Geneva, Switzerland


artists. So effective have these partnerships 
been for stimulating scientific creativity that 
some research institutions have established programmes
to encourage them (see ‘Up your art
quotient’). Europe's particle-physics laboratory,
CERN, for example, established a programme 
called COLLIDE to foster ingenuity through the 
exchange of ideas between scientists and artists. 
The initiative brings world-class artists to the 
laboratory and campus in Geneva, Switzerland, 
for a residency of up to three months. 

CERN theoretical physicist Luis Alvarez-Gaumé
(who moves to Stony Brook University
in New York this month) recently worked with
two UK artists as part of the initiative. The artists
used scientific data and computer-generated
animation to probe how scientific instruments
and discoveries in particle physics influence
the perception of nature. Explaining
his work to them for their upcoming piece
helped Alvarez-Gaumé to find holes in his own
knowledge. “It allows us to really see, to appreciate
and understand what we are talking about,”
he says. Kieniewicz agrees that working with
artists helps scientists to reframe their thinking.
“The artist will come in from a bit of a tangent,
probing areas where scientists wouldn't think to
probe,” she says. “They are really good at asking
‘what if’ questions: ‘what if we could hear the
Higgs boson?’”

Art-science collaborations can produce other 
benefits, too. Murphy established a programme 
at her lab in which university art students come 
in and ask questions of her chemistry pupils. 
She quickly realized that her students rapidly 
improved at communicating their work and 
ideas. “When you're giving a presentation to a
totally non-scientific audience, you have to be 
able to communicate really well,” she says. 




(go.nature.com/2b5b9jb), the
Guapamacataro Center for Art and Ecology
in Michoacan, Mexico, and the Institute for
Advanced Study in Princeton, New Jersey
(http://go.nature.com/2brypgx).

• Participate in a collaboration such as
those organized by the Institute of Physics
in London (go.nature.com/2b9ycel).

• Sign up for a drawing or pottery class
through community education, audit a class
at your university or search for museum-
based programmes.

• Search Meetup.com for art-related
outings.

• Search Twitter for #sciart. S.M.E.


And scientists are often awestruck by seeing 
artists portray what they've learned in com- 
pletely new ways. Artists who have worked 
alongside Murphy’s students, for example, have 
created everything from a dance interpreting 
the view through an electron microscope to a 
computer-sized block of canvas with light bulbs 
shining through at various levels of brightness, 
inspired by the gold particles that the artist 
glimpsed through a microscope. Because the 
results are usually exhibited to both scientists 
and artists, they provide an ideal opportunity 
for interdisciplinary conversations. 

The collaborations spawn more than 
impressive art — they are rich for researchers 
too, says Martin Kemp, an emeritus art historian 
at Trinity College in Oxford, UK, who special- 
izes in visualization of science and has written a 
book called Structural Intuitions: Seeing Shapes 
in Art and Science (Univ. Virginia Press, 2016). 
He says that perception is deeply embedded in 
the brain by the end of formal schooling, yet 
researchers must embrace other ways of think- 
ing and visualizing, and can do so through mak- 
ing or viewing art. He thought he was leaving 
science forever when he went to do graduate 
studies at an art institute — until he stumbled 
across Leonardo da Vinci's water drawings. 
The detailed sketches depicting patterns and 
shapes of water, wind and air reflect the theory 
of hydrodynamics, he says — completely appli- 
cable to both art and science. “I felt I'd come 
home,” Kemp says.

Although that sort of leap is practical for 
very few (“It doesn't help to have art school on 
your CV to get funded,” Balbo says), most of the
bright scientists Kemp knows engage in the arts 
in some form. And some even say it is essential 
to their careers. 

“If I had not gone to art school, I don’t believe
I would be a scientist today,” Dijkgraaf says.


Sheila Mulrooney Eldred is a freelance 
writer in Minneapolis, Minnesota. 
Kirsten knocked. The carved ebony
knocker landed soundlessly on velvet.

The door swung open and she 
entered the dark hallway. The door closed, 
not with a sound but with an intensification 
of the air inside, making it dryer, darker and 
more silent. 

She waited until her eyes 
had got used to the dark- 
ness. Sometimes she felt 
anticipation, sometimes 
dread. Often she felt noth- 
ing much. She appreciated 
the efforts of her hosts, but 
gratitude no longer had 
the power to mitigate the 
realization that she'd never 
get out of here. As the only 
survivor of the crash, what 
she missed most was other 
humans. 

A light glowed up at 
the end of the hallway. 
The layout of the facility 
changed with every visit. 
Surprise and curiosity 
were emotions as well, and 
again, they did their best. 

Kirsten started walking, 
reluctant to let go of her 
anticipatory state. Her feet 
touched warm, non-reso- 
nant flooring that came close to being wood. 
Close enough for her soles, if not for her ears 
and sense of smell. Even though her mind 
was sinking into despair, her body could still 
react to small pleasures. She rolled her soles 
over the floor, savouring the moment, wish- 
ing she could store the sensation. 

A seven-foot dragon burst from the ceil- 
ing and shrieked in her face. 

Kirsten’s heartbeat accelerated slightly. If 
only she could still experience the full range 
of emotions. Her mind knew this was a side 
effect of depression, but that didn’t seem to 
change anything. The dragon flapped its 
kimono wings and stomped its booted feet. 
The bright reds and yellows of the dress, the 
gyrations and dips of the dance were gor- 
geous. As were the tinkling, winking head- 
dress decorations, like seahorse antennae. 
Kirsten clapped politely. The unmoving 
mask turned in her direction, the figure took 
a bow and vanished. 

Beauty and grace moved her only a little. 
Her hosts had tried ugliness and stenches 


A sense of loss.


and danger to stir her, and that had worked 
in the beginning. But everything palled 
eventually, so they had reverted to recreating 
familiarity. Although Noh hadn't been in her 
cultural vocabulary back on Earth. 

Fuzzy blackness descended over her head. 
Kirsten fought, from real surprise, but her 
captors jostled her, rolled her over, carried
her somewhere while she bounced on their
bodies, was banged against obstacles and felt
sick to her stomach. Surprise? Check. Discomfort?
Check. Nausea? Bruising? Even
unpleasant sensations were welcome. 

She was squeezed out of her sack, the 
clothes ripped from her body, rolled around 
in scalding water and beaten with something 
like rubber hose. Only it wouldn't be rubber, 
or hose. Probably her hosts’ tentacles. Just 
being aware of that possibility made it feel 
less real. 

Gentle touches soothed her skin with 
water almost smelling like jasmine. So close. 
How did her nose even know it wasn't the real 
thing, after so long without it? Maybe she was 
hallucinating and that was why nothing felt 
quite real. It was as if the real Kirsten, the one 
with depth and warmth, was just behind a
curtain, tantalizingly

The aliens lowered Kirsten into a warm
mud bath, which was a regular occurrence 
during the ritual. It was the one thing 
they got right, simply because she'd never
had one back home. Submerged younglings
massaged her sore body with their
tentacles. 

Something warm touched her shoulders. 
She leaned forward so 
they could massage her. 
Warm, slightly rough 
fingers dug into her tight 
shoulders in just the right 
places. How had they 
learned this? She imag- 
ined tentacled beings 
practising on each other’s 
non-existent shoulder 
muscles, or on Kirsten 
dolls. She smiled. That 
was a good feeling, even 
if her smile muscles tired 
quickly. 

Then a sensation 
ghosted over her back, 
as if the masseur’s hairy 
underarm accidentally 
touched her back. Her 
body flooded with adrenaline,
the hair rose on her
neck, her breath caught 
in her throat. The touch 
of another human being’s 
skin. The thing she'd 
missed more than words. 

“Who are you?” the real Kirsten said, 
sharply. 

She must have startled the masseur, 
because she glimpsed a flash of an old cashmere
sweater of hers it held in two tentacles.
The illusion shattered. 

She cried so hard it hurt. For that one 
moment she'd been back to her old self 
again, but it was worse than feeling everything
through a veil. She was going to die
alone. Without ever having touched another 
human being again. 

She rose from the mud bath, drying her
tears. No more rituals. She never wanted to 
experience another moment of hope like 
that. It hurt too much.


Bo Balder is the first Dutch author to be 
published in Fantasy & Science Fiction. 
Her short fiction has appeared in Crossed 
Genres, Futuristica Volume 1 and other 
venues. Her SF novel The Wan is published 
by Pink Narcissus Press. 